On Tue, Nov 09, 2010 at 04:04:17PM +1100, Dave Chinner wrote:
> On Mon, Nov 08, 2010 at 07:36:28PM -0800, Paul E. McKenney wrote:
> > On Mon, Nov 08, 2010 at 06:09:29PM -0500, Christoph Hellwig wrote:
> > > This patch generally looks good to me, but with so much RCU magic I'd
> > > prefer if Paul & Eric could look over it.
> > Is there a git tree, tarball, or whatever?
> git://git.kernel.org/pub/scm/linux/kernel/git/dgc/xfsdev.git working
Thank you -- I have downloaded this and will look it over.
Once the C++ guys get done grilling me on memory-model issues...
> contains the series that this patch is in.
> > For example, I don't see
> > how this patch handles the case of an inode being freed just as an RCU
> > reader gains a reference to it,
> The XFS_IRECLAIM flag is set on inodes as they transition into the
> reclaim state long before they are freed. The XFS_IRECLAIM flag is left
> there until the inode is freed. Hence lookups in xfs_iget_cache_hit()
> will see this.
> If the inode has been reallocated, the inode number will not yet be
> set, or the inode state will have changed to XFS_INEW, both of which
> xfs_iget_cache_hit() will also reject.
> > but then reallocated as some other inode
> > (so that ->ino is nonzero) before the RCU reader gets a chance to actually
> > look at the inode.
> XFS_INEW is not cleared until well after a new ->i_ino is set, so
> the lookup should trip over XFS_INEW in that case. I think that
> I may need to move the inode number check under the i_flags_lock
> after validating the flags - more to check that we've got the
> correct inode than to validate we have a freed inode.
OK, this sounds promising. Of course, the next question is "how quickly
can the inode number be available for reuse?"
> > But such a check might well be in the code that this
> > patch didn't change...
> Yeah, most of the XFS code is already in a form compatible with such
> RCU use because inodes have always had a quiescent "reclaimable"
> state between active and reclaim (XFS_INEW -> active ->
> XFS_IRECLAIMABLE -> XFS_IRECLAIM) where the inode can be reused
> before being freed. The result is that lookups have always had to
> handle races with inodes that have just transitioned into the
> XFS_IRECLAIM state and hence cannot be immediately reused...
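
For illustration, the lookup-side checks discussed above might be sketched
roughly as follows. This is a userspace mock, not the code from the patch:
the sk_* names, the struct layout, and the use of a pthread mutex in place
of i_flags_lock are all assumptions made for the example. It only shows the
ordering Dave describes: validate the flags first, then compare the inode
number, both under the same lock. Treating a reclaimable inode as usable by
the lookup is likewise a simplification of "can be reused before being
freed".

```c
/* Userspace sketch only: every name here is a hypothetical stand-in. */
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for XFS_INEW, XFS_IRECLAIMABLE and XFS_IRECLAIM. */
enum {
	SK_INEW         = 1u << 0,
	SK_IRECLAIMABLE = 1u << 1,
	SK_IRECLAIM     = 1u << 2,
};

struct sk_inode {
	pthread_mutex_t flags_lock;	/* stand-in for i_flags_lock */
	unsigned int    flags;
	uint64_t        ino;		/* 0 means "not yet set" */
};

static void sk_inode_init(struct sk_inode *ip, uint64_t ino,
			  unsigned int flags)
{
	pthread_mutex_init(&ip->flags_lock, NULL);
	ip->flags = flags;
	ip->ino = ino;
}

/*
 * Return true if a lookup for 'ino' may use 'ip'.
 *
 * The flags are checked first: an inode that is still being set up
 * (SK_INEW) or is being torn down (SK_IRECLAIM) is rejected outright.
 * Only then is the inode number compared, so a reallocated inode with
 * a different number is rejected as well.
 */
static bool sk_lookup_ok(struct sk_inode *ip, uint64_t ino)
{
	bool ok = true;

	pthread_mutex_lock(&ip->flags_lock);
	if (ip->flags & (SK_INEW | SK_IRECLAIM))
		ok = false;
	else if (ip->ino != ino)
		ok = false;
	pthread_mutex_unlock(&ip->flags_lock);

	return ok;
}
```

In this sketch a caller that gets false back would drop the candidate and
retry the lookup; what the real code does at that point is not shown here.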