xfs
[Top] [All Lists]

Re: [PATCH] xfs: avoid synchronous transaction in xfs_fs_write_inode

To: Christoph Hellwig <hch@xxxxxxxxxxxxx>
Subject: Re: [PATCH] xfs: avoid synchronous transaction in xfs_fs_write_inode
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Fri, 18 Jun 2010 09:51:13 +1000
Cc: xfs@xxxxxxxxxxx
In-reply-to: <20100615112051.GA25064@xxxxxxxxxxxxx>
References: <20100608195905.GA577@xxxxxxxxxxxxx> <20100609072911.GI7869@dastard> <20100615112051.GA25064@xxxxxxxxxxxxx>
User-agent: Mutt/1.5.20 (2009-06-14)
On Tue, Jun 15, 2010 at 07:20:51AM -0400, Christoph Hellwig wrote:
> On Wed, Jun 09, 2010 at 05:29:11PM +1000, Dave Chinner wrote:
> > All great - I attempted this myself - but it breaks bulkstat. See
> > xfstest 183:
> 
> I can't reproduce the failure.  And then I don't think we should
> prioritize bulkstate over metadata performance.  The only major users
> of bulkstat on XFS in the mainline kernel is xfsdump, and metadata
> performance during normal loads is a lot more important.  And the
> difference when using LVM/MD will be even larger with this as barriers
> are a lot more expensive there.
> 
> What do you think about adding something like the patch below which
> always makes bulkstat always use the iget version which doesn't show
> this sort of problems.
> 
> ---
> 
> From: Christoph Hellwig <hch@xxxxxx>
> Subject: xfs: always use iget in bulkstat
> 
> The non-coherent bulkstat versionsthat look directly at the inode
> buffers causes various problems with performance optimizations that
> make increased use of just logging inodes.  This patch makes bulkstat
> always use iget, which should be fast enough for normal use with the
> radix-tree based inode cache introduced a while ago.
> 
> Signed-off-by: Christoph Hellwig <hch@xxxxxx>

An important note: I think this patch is necessary to avoid bulkstat
returning unlinked inodes when cache thrashing is occurring. The
current non-coherent lookup will incorrectly return unlinked inodes
if the inode cluster is marked stale, reclaimed and turfed from
memory due to an unlink and memory pressure between the ialloc btree
lookup and the cluster buffer read.

The only way to avoid this is to ensure that the ialloc btree lookup
and the formatting of the inode into the user buffer is atomic, which
means we have to do a coherent lookup that returns a locked into....

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx

<Prev in Thread] Current Thread [Next in Thread>