[PATCH 07/18] xfs: don't use vfs writeback for pure metadata modifications

Dave Chinner david at fromorbit.com
Tue Sep 14 19:28:28 CDT 2010


On Tue, Sep 14, 2010 at 05:12:17PM -0500, Alex Elder wrote:
> On Tue, 2010-09-14 at 20:56 +1000, Dave Chinner wrote:
> > From: Dave Chinner <dchinner at redhat.com>
> > 
> > Under heavy multi-way parallel create workloads, the VFS struggles to write
> > back all the inodes that have been changed in age order. The bdi flusher thread
> > becomes CPU bound, spending 85% of its time in the VFS code, mostly traversing
> > the superblock dirty inode list to separate dirty inodes old enough to flush.
> > 
> > We already keep an index of all metadata changes in age order - in the AIL -
> > and continued log pressure will do age ordered writeback without any extra
> > overhead at all. If there is no pressure on the log, the xfssyncd will
> > periodically write back metadata in ascending disk address offset order, so it
> > will be very efficient.
> 
> So log pressure will cause the logged updates to the inode to be
> written to disk (in order), which is all we really need.  Is that
> right?

Yes. And if there is no log pressure, xfssyncd will do the writeback
in disk address order, which is efficient.
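
As a rough illustration of those two orderings, here is a toy model.
It is only a sketch and not XFS code: LogItem, push_for_log_pressure()
and periodic_sync() are made-up names, and the LSN and disk address
values are invented for the example. The real AIL is already kept in
age order, so the sort by LSN below only stands in for that.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LogItem:
    """Hypothetical stand-in for a logged metadata object tracked in the AIL."""
    name: str
    lsn: int    # log sequence number of the change that dirtied it (its age)
    daddr: int  # disk address the object is eventually written back to


def push_for_log_pressure(ail: List[LogItem], target_lsn: int) -> List[LogItem]:
    """Items to write back to free log space up to target_lsn, oldest first."""
    by_age = sorted(ail, key=lambda item: item.lsn)
    return [item for item in by_age if item.lsn <= target_lsn]


def periodic_sync(ail: List[LogItem]) -> List[LogItem]:
    """With no log pressure: write everything back in ascending disk order."""
    return sorted(ail, key=lambda item: item.daddr)


if __name__ == "__main__":
    ail = [
        LogItem("inode A", lsn=10, daddr=900),
        LogItem("inode B", lsn=20, daddr=100),
        LogItem("dir block", lsn=30, daddr=500),
    ]
    print([item.name for item in push_for_log_pressure(ail, target_lsn=20)])
    print([item.name for item in periodic_sync(ail)])
```

Neither path needs the VFS to walk a per-superblock dirty inode list
looking for inodes that are old enough to flush.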

> Therefore we don't need to rely on the VFS layer to get
> the dirty inode pushed out?

No, we don't. Indeed, for all other types of metadata (btree blocks,
directory/attribute blocks, etc) we already rely on the
xfsaild/xfsbufd to write them out in a timely manner because the VFS
knows nothing about them.

> Is writeback the only reason we should inform the VFS that an
> inode is dirty?  (Sorry, I have to leave shortly and don't have
> time to follow this at the moment--I may have to come back to
> this later.)

Yes, pretty much. Take your time - this is one of the more radical
changes in the patch set...

Cheers,

Dave.
-- 
Dave Chinner
david at fromorbit.com



