On Tue, Nov 09, 2010 at 09:12:42PM -0800, Paul E. McKenney wrote:
> On Tue, Nov 09, 2010 at 04:04:17PM +1100, Dave Chinner wrote:
> > On Mon, Nov 08, 2010 at 07:36:28PM -0800, Paul E. McKenney wrote:
> > > On Mon, Nov 08, 2010 at 06:09:29PM -0500, Christoph Hellwig wrote:
> > > > This patch generally looks good to me, but with so much RCU magic I'd
> > > > prefer if Paul & Eric could look over it.
> > >
> > > Is there a git tree, tarball, or whatever?
> > git://git.kernel.org/pub/scm/linux/kernel/git/dgc/xfsdev.git working
> > contains the series that this patch is in.
> Thank you -- I have downloaded this and will look it over.
> Once the C++ guys get done grilling me on memory-model issues...
fs/xfs/xfs_iget.c is the place to start - that's where the inode
cache lookups occur...
> > > For example, I don't see
> > > how this patch handles the case of an inode being freed just as an RCU
> > > reader gains a reference to it,
> > The XFS_IRECLAIM flag is set on inodes as they transition into the
> > reclaim state, long before they are freed, and it is left set once
> > the inode is freed. Hence lookups in xfs_iget_cache_hit() will see it.
> > If the inode has been reallocated, the inode number will not yet be
> > set, or the inode state will have changed to XFS_INEW, both of which
> > xfs_iget_cache_hit() will also reject.
> > > but then reallocated as some other inode
> > > (so that ->ino is nonzero) before the RCU reader gets a chance to actually
> > > look at the inode.
> > XFS_INEW is not cleared until well after a new ->i_ino is set, so
> > the lookup should trip over XFS_INEW in that case. I think that
> > I may need to move the inode number check under the i_flags_lock
> > after validating the flags - more to check that we've got the
> > correct inode than to validate we have a freed inode.
> OK, this sounds promising. Of course, the next question is "how quickly
> can the inode number be available for reuse?"
Immediately. Indeed, an inode number can be reused even before the
inode is reclaimed. However, looking at the case of having already
freed the inode when the new lookup comes in, I think checking
everything under the i_flags_lock is safe.
That is, if we've freed inode #X (@ &A) and find &A during the RCU
protected lookup for inode #X, the only way the inode number in the
structure at &A would match #X is if the new #X was reallocated
at &A again. In that case, if the inode wasn't fully set up, we'd
find either XFS_INEW or XFS_IRECLAIM still set on it and we'd back off
and try the lookup again. However, if inode #X was reallocated at
address &B then the inode at &A would not match #X regardless of
whether &A had been reallocated or not.
Hence I think checking the inode number under the i_flags_lock after
checking that XFS_INEW|XFS_IRECLAIM are not set is sufficient to
validate that we have both an active inode and the correct inode.
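
The check order described above could be sketched roughly as follows.
This is a userspace illustration only, not the actual
xfs_iget_cache_hit() code: struct sk_inode, the SK_* flag values and
sk_inode_matches() are hypothetical names, and a C11 atomic_flag spin
loop stands in for the i_flags_lock spinlock. The RCU read-side
section around the lookup is assumed to be handled by the caller.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values, for illustration only. */
#define SK_INEW		(1u << 0)	/* inode still being set up */
#define SK_IRECLAIM	(1u << 1)	/* inode in reclaim, or already freed */

/* Hypothetical stand-in for the parts of the inode the check touches. */
struct sk_inode {
	atomic_flag	flags_lock;	/* stand-in for i_flags_lock */
	unsigned int	flags;
	uint64_t	ino;
};

static void sk_lock(struct sk_inode *ip)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&ip->flags_lock,
						 memory_order_acquire))
		;
}

static void sk_unlock(struct sk_inode *ip)
{
	atomic_flag_clear_explicit(&ip->flags_lock, memory_order_release);
}

/*
 * Return true only if ip is an active inode whose number is ino.
 * On false, the caller is expected to back off and retry the lookup.
 */
static bool sk_inode_matches(struct sk_inode *ip, uint64_t ino)
{
	bool ok = false;

	assert(ip != NULL);
	sk_lock(ip);

	/* Step 1: reject inodes that are freed, in reclaim, or not set up. */
	if (ip->flags & (SK_INEW | SK_IRECLAIM))
		goto out;

	/* Step 2: only now is the inode number stable enough to compare. */
	if (ip->ino != ino)
		goto out;

	ok = true;
out:
	sk_unlock(ip);
	return ok;
}
```

Both checks sit inside the same lock hold, so the flags cannot change
between validating the state and comparing the inode number.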