On Tue, 2010-09-14 at 20:56 +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@xxxxxxxxxx>
> Under heavy multi-way parallel create workloads, the VFS struggles to write
> back all the inodes that have been changed in age order. The bdi flusher
> becomes CPU bound, spending 85% of its time in the VFS code, mostly traversing
> the superblock dirty inode list to separate dirty inodes old enough to flush.
> We already keep an index of all metadata changes in age order - in the AIL -
> and continued log pressure will do age ordered writeback without any extra
> overhead at all. If there is no pressure on the log, the xfssyncd will
> periodically write back metadata in ascending disk address offset order, so
> it will be very efficient.
So log pressure will cause the logged updates to the inode to be
written to disk (in order), which is all we really need. Is that
right? Therefore we don't need to rely on the VFS layer to get
the dirty inode pushed out?
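
To check my reading of the above, here is a toy sketch in C of an
age-ordered list being pushed up to a target LSN. It is only an
illustration under my own assumptions: the names (toy_ail, toy_item,
toy_ail_insert, toy_ail_push) are made up and this is not the XFS AIL
code or its API. The point it tries to show is that when the list is
already kept in age order, a push can stop at the first entry that is
too new instead of scanning every dirty inode.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: these names are hypothetical, not the XFS AIL API. */
#define TOY_AIL_MAX 16

struct toy_item {
    unsigned long lsn;  /* log sequence number of the logged change */
    int ino;            /* which inode was logged */
};

struct toy_ail {
    struct toy_item items[TOY_AIL_MAX]; /* kept sorted by ascending lsn */
    size_t count;
};

/* Insert an item while keeping the array in age (lsn) order. */
static void toy_ail_insert(struct toy_ail *ail, unsigned long lsn, int ino)
{
    size_t i;

    assert(ail->count < TOY_AIL_MAX);
    i = ail->count;
    /* Shift newer entries up one slot to make room. */
    while (i > 0 && ail->items[i - 1].lsn > lsn) {
        ail->items[i] = ail->items[i - 1];
        i--;
    }
    ail->items[i].lsn = lsn;
    ail->items[i].ino = ino;
    ail->count++;
}

/*
 * Remove every item with lsn <= target, oldest first.
 * The flushed inode numbers are stored in out[], which the caller
 * must size to hold at least ail->count entries. Returns how many
 * items were flushed. The walk stops at the first item that is too
 * new, so it never scans the whole list to find the old entries.
 */
static size_t toy_ail_push(struct toy_ail *ail, unsigned long target, int *out)
{
    size_t n = 0, i;

    while (n < ail->count && ail->items[n].lsn <= target) {
        out[n] = ail->items[n].ino;
        n++;
    }
    /* Compact the remaining (newer) items to the front. */
    for (i = n; i < ail->count; i++)
        ail->items[i - n] = ail->items[i];
    ail->count -= n;
    return n;
}
```

If that model is roughly right, the age-ordered scan the bdi flusher
does over the superblock dirty list is redundant for logged changes.
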
Is writeback the only reason we should inform the VFS that an
inode is dirty? (Sorry, I have to leave shortly and don't have
time to follow this at the moment--I may have to come back to
this later.)
> Hence we can stop marking VFS inodes dirty during transaction commit or when
> changing timestamps during transactions. This will limit the inodes in the
> superblock dirty list to those containing data or unlogged metadata changes.
The code looks fine to me, but I don't know whether the
change it implements is correct or not without digging
in a little deeper.
> Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
> fs/xfs/linux-2.6/xfs_iops.c | 18 +++++-------------