Re: inode_permission NULL pointer dereference in 3.13-rc1

To: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Subject: Re: inode_permission NULL pointer dereference in 3.13-rc1
From: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
Date: Fri, 29 Nov 2013 06:59:41 +0000
Cc: Dave Chinner <david@xxxxxxxxxxxxx>, Christoph Hellwig <hch@xxxxxxxxxxxxx>, linux-fsdevel <linux-fsdevel@xxxxxxxxxxxxxxx>, xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20131129041416.GV10323@xxxxxxxxxxxxxxxxxx>
References: <20131127100906.GA19740@xxxxxxxxxxxxx> <20131128162618.GO10323@xxxxxxxxxxxxxxxxxx> <20131128212301.GP10323@xxxxxxxxxxxxxxxxxx> <20131128225102.GS10988@dastard> <20131128234441.GQ10323@xxxxxxxxxxxxxxxxxx> <CA+55aFxLZxy75fO4ZXO4Stiu1sMx1q=eJ7HSk-UTCX61jPrirA@xxxxxxxxxxxxxx> <20131129024121.GS10323@xxxxxxxxxxxxxxxxxx> <20131129035939.GT10323@xxxxxxxxxxxxxxxxxx> <20131129040658.GU10323@xxxxxxxxxxxxxxxxxx> <20131129041416.GV10323@xxxxxxxxxxxxxxxxxx>
Sender: Al Viro <viro@xxxxxxxxxxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)

On Fri, Nov 29, 2013 at 04:14:16AM +0000, Al Viro wrote:

> And yes, it has fixed the problem with generic/234.  I'll do full xfstests
> run to see if there's anything else, but this one is obviously needed.
> I'll send it with sane commit message (along with follow_dotdot_rcu()
> fix) later tonight.  path_init() race is a separate story - that one should
> probably go separately, since we'll want it in all branches starting with
> early 2011 or so.

OK, it survives.  However, looking a bit more at follow_dotdot_rcu()...
AFAICS, we have a narrow oopsable race, from 2.6.38 through 3.12 - think what
happens if we are walking through /tmp/foo/bar/../baz in RCU mode and we'd just
reached /tmp/foo/bar.  handle_dotdot() is called, calling follow_dotdot_rcu().
OK, we are not about to cross a mountpoint.  Read ->d_seq of /tmp/foo into
seq, check that nd->seq matches /tmp/foo/bar (it does, everything's fine)
and set nd->path.dentry to /tmp/foo, with nd->seq set to seq.  Then
we check if /tmp/foo is overmounted by something; it isn't and now we set
nd->inode to nd->path.dentry->d_inode.

Sure, it's _very_ hard to get into trouble here - we need somebody to remove
/tmp/foo/bar *and* /tmp/foo while we'd been walking vfsmount hash,
but in theory it is not impossible to get NULL nd->inode.  Then
link_path_walk() gets to checking that we have a directory and we get
an oops on checking inode flags.
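
To make the window concrete, here is a rough userspace model of the sequence
above.  It is NOT the fs/namei.c code: the struct layouts, step_dotdot(),
model_rmdir() and the "recheck" knob are all made up for illustration, and
the seqcount is reduced to a plain counter.  The recheck branch only shows
one conceivable way of noticing the stale read; it is not a description of
whatever fix ends up in the tree.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct inode { int is_dir; };
struct dentry { unsigned seq; struct inode *d_inode; };
struct nameidata_model {
	struct dentry *dentry;	/* models nd->path.dentry */
	unsigned seq;		/* models nd->seq */
	struct inode *inode;	/* models nd->inode */
};

/* Models a completed write side: the counter moves on, the inode is gone. */
static void model_rmdir(struct dentry *d)
{
	d->seq += 2;
	d->d_inode = NULL;
}

/*
 * Models the ".." step described above.  remove_in_window simulates
 * somebody removing the parent while we were busy with the mount hash.
 * Returns 0 if the walk carries on in lazy mode, -1 if it bails out.
 */
static int step_dotdot(struct nameidata_model *nd, struct dentry *parent,
		       int remove_in_window, int recheck)
{
	unsigned seq = parent->seq;	/* sample ->d_seq of the parent */

	nd->dentry = parent;
	nd->seq = seq;

	/* ... the overmount check would happen here; the race window ... */
	if (remove_in_window)
		model_rmdir(parent);

	nd->inode = parent->d_inode;	/* late read, may be NULL by now */

	/* Assumed mitigation: notice that the sampled seq went stale. */
	if (recheck && parent->seq != nd->seq)
		return -1;
	return 0;
}
```

With remove_in_window set and recheck clear, step_dotdot() returns 0 and
leaves nd->inode NULL, which is the state that would blow up on the next
directory check.  With recheck set, the same interleaving returns -1 instead.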

I really don't like the way we have nd->inode updates scattered all over
the place in fs/namei.c ;-/  I'm looking into possible ways to deal
with it sanely, but that'll have to wait for tomorrow...

Anyway, I've pushed the minimal regression fix to vfs.git; please, pull
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-linus

Al Viro (1):
      fix bogus path_put() of nd->root after some unlazy_walk() failures

 fs/namei.c |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

