On Fri, Dec 11, 2009 at 06:46:27AM -0500, Christoph Hellwig wrote:
> On Fri, Dec 11, 2009 at 11:43:53AM +1100, Dave Chinner wrote:
> > > > + spin_lock(&mp->m_perag_lock);
> > > > + pag = radix_tree_lookup(&mp->m_perag_tree, agno);
> > > > + spin_unlock(&mp->m_perag_lock);
> > > > + return pag;
> > >
> > > Can't we do this as a lock-less (at least for lookups) radix tree?
> > I think it can be (RCU-based?) , but I think that makes sense as a
> > followup optimisation once we have confidence the code is working
> > as it should.
> Nick, what are the rules for the lock-less radix tree reader side?
OK, well basically we can do a radix_tree_lookup and get the pointer
without locking. In order to be able to follow the pointer to the
object of course the calling code needs to do its own synchronization
(eg. objects might be RCU protected as well so it can be used or a
refcount can be taken on it under rcu_read_lock).
A single lookup basically has concurrency semantics as though it may
have been performed before or after any concurrent modifications. This
is really just the same as:
ptr = radix_tree_lookup()
/* use the ptr */
Because before the lock is taken and after it is released, we can get
any kind of interleaving of modifications anyway, so by the time we
use the pointer then regardless of whether we use lock or RCU, then the
tree may have been modified anyway.
This looks like the pattern used in Dave's patch.
If you need any kind of atomic multiple lookups (including gang lookups)
or atomic lookups and modifications, then you'll need a lock.
> Dave is introducing a radix tree in XFS which has the following access
> - lots of read side access during normal fs operations
> - insertations currently only happen during mount before the fs is life
> and during a very rare operation (filesystem resize)
> - currently items are never deleted, but we might need that in the
> future (for filesystem shrink support)
> - the objects pointed to are kmalloced and refcounted structures,
> but we don't even strictly need the refcounting until the filesystem
> shrink support is implemented
This sounds like it would be perfectly suited to RCU lookups.