On Tue, Sep 14, 2010 at 08:56:04PM +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@xxxxxxxxxx>
> With delayed logging greatly increasing the sustained parallelism of inode
> operations, the inode cache locking is showing significant read vs write
> contention when inode reclaim runs at the same time as lookups. There is
> also a lot more write lock acquistions than there are read locks (4:1 ratio)
> so the read locking is not really buying us much in the way of parallelism.
That's just for your parallel creates workload, isn't it? If we'd get
that bad hit rates on normal workloads something is pretty wrong with
the inode cache.
For a workload with 4 times as many writelocks just changing the
rwlock to a spinlock should provide more benefits. Did you test what
effect this has on other workloads?
In addition to that I feel really uneasy about changes to the inode
cache locking without really heavy NFS server testing - we just had too
many issues in this area in the past.
> To avoid the read vs write contention, change the cache to use RCU locking on
> the read side. To avoid needing to RCU free every single inode, use the built
> in slab RCU freeing mechanism. This requires us to be able to detect lookups
> freed inodes, so en??ure that ever freed inode has an inode number of zero and
> the XFS_IRECLAIM flag set. We already check the XFS_IRECLAIM flag in cache hit
> lookup path, but also add a check for a zero inode number as well.
How does this interact with slab poisoning?