Subject: [PATCH 10/27] xfs: improve sync behaviour in the face of aggressive dirtying
From: Christoph Hellwig <hch@xxxxxxxxxxxxx>
Date: Wed, 29 Jun 2011 10:01:19 -0400

The following script from Wu Fengguang shows very bad behaviour in XFS
when data is aggressively dirtied during a sync, with sync times up to
almost 10 times as long as on ext4.

A large part of the issue is that XFS writes data out itself two times
in the ->sync_fs method, overriding the livelock protection in the core
writeback code.  Another issue is the lock-less xfs_ioend_wait call,
which doesn't prevent new ioends from being queued up while waiting for
the count to reach zero.
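
To illustrate the second point, here is a toy model in shell of why
waiting for a count to reach zero has no upper bound when nothing stops
new work from being queued.  It is a sketch only and not the XFS code:
simulate_wait, its two modes and the one-completion-per-tick rate are
all assumptions made up for the illustration.

```shell
#!/bin/sh
# Toy model only, NOT the XFS implementation. All names are hypothetical.
# Each "tick" one outstanding I/O completes; a writer may queue a new one.

# simulate_wait MODE START_COUNT MAX_TICKS
#   MODE=open   : new I/O keeps being queued while we wait (lock-less wait)
#   MODE=closed : no new I/O is admitted once the wait has started
# Prints the number of ticks until the count reached zero, or "starved"
# if it did not get there within MAX_TICKS.
simulate_wait() {
    mode=$1
    count=$2
    max_ticks=$3
    tick=0
    while [ "$count" -gt 0 ]; do
        if [ "$tick" -ge "$max_ticks" ]; then
            echo starved
            return 0
        fi
        count=$((count - 1))            # one I/O completes
        if [ "$mode" = open ]; then
            count=$((count + 1))        # ...and the writer queues another
        fi
        tick=$((tick + 1))
    done
    echo "$tick"
}

simulate_wait open 5 1000      # prints "starved"
simulate_wait closed 5 1000    # prints "5"
```

With new I/O admitted as fast as it completes, the wait never finishes;
once nothing new is admitted it ends after exactly as many ticks as
there were I/Os outstanding.  The model says nothing about how such a
wait should be fixed, only why it is unbounded.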

This patch removes the XFS-internal sync calls and relies on the VFS
to do its work just like all other filesystems do.  Note that the
i_iocount wait, which is rather suboptimal, is simply removed here.
We already do it in ->write_inode, which keeps the current suboptimal
behaviour.  We'll eventually need to remove that as well, but that's
material for a separate commit.

------------------------------ snip ------------------------------

umount /dev/sda7
mkfs.xfs -f /dev/sda7
# mkfs.ext4 /dev/sda7
# mkfs.btrfs /dev/sda7
mount /dev/sda7 /fs

echo $((50<<20)) > /proc/sys/vm/dirty_bytes

for i in `seq 10`
do
        dd if=/dev/zero of=/fs/zero-$i bs=1M count=1000 &
        pid="$pid $!"
done

sleep 1

tic=$(date +'%s')
sync
tac=$(date +'%s')

echo sync time: $((tac-tic))
egrep '(Dirty|Writeback|NFS_Unstable)' /proc/meminfo

pidof dd > /dev/null && { kill -9 $pid; echo sync NOT livelocked; }
------------------------------ snip ------------------------------

Reported-by: Wu Fengguang <fengguang.wu@xxxxxxxxx>
Signed-off-by: Christoph Hellwig <hch@xxxxxx>

Index: xfs/fs/xfs/linux-2.6/xfs_sync.c
--- xfs.orig/fs/xfs/linux-2.6/xfs_sync.c        2011-06-29 11:26:14.109219361 
+++ xfs/fs/xfs/linux-2.6/xfs_sync.c     2011-06-29 11:37:20.642275110 +0200
@@ -359,14 +359,12 @@ xfs_quiesce_data(
        int                     error, error2 = 0;
-       /* push non-blocking */
-       xfs_sync_data(mp, 0);
        xfs_qm_sync(mp, SYNC_TRYLOCK);
-       /* push and block till complete */
-       xfs_sync_data(mp, SYNC_WAIT);
        xfs_qm_sync(mp, SYNC_WAIT);
+       /* force out the newly dirtied log buffers */
+       xfs_log_force(mp, XFS_LOG_SYNC);
        /* write superblock and hoover up shutdown errors */
        error = xfs_sync_fsdata(mp);

