xfs
[Top] [All Lists]

[PATCH] xfs: improve sync behaviour in face of aggressive dirtying

To: xfs@xxxxxxxxxxx
Subject: [PATCH] xfs: improve sync behaviour in face of aggressive dirtying
From: Christoph Hellwig <hch@xxxxxxxxxxxxx>
Date: Fri, 17 Jun 2011 09:14:01 -0400
Cc: Wu Fengguang <fengguang.wu@xxxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
The following script from Wu Fengguang shows very bad behaviour in XFS
when aggressively dirtying data during a sync on XFS, with sync times
up to almost 10 times as long as ext4.

A large part of the issue is that XFS writes data out itself two times
in the ->sync_fs method, overriding the lifelock protection in the core
writeback code, and another issue is the lock-less xfs_ioend_wait call,
which doesn't prevent new ioend from beeing queue up while waiting for
the count to reach zero.

This patch removes the XFS-internal sync calls and relies on the VFS
to do it's work just like all other filesystems do, and just uses the
internal inode iterator to wait for pending ioends, but now with the
iolock held.  With this fix we improve the sync time quite a bit,
but we'll need further work split i_iocount between pending direct I/O
requests that are only relevant for truncate, and pending unwritten
extent conversion that matter for sync and fsync.

------------------------------ snip ------------------------------
#!/bin/sh

umount /dev/sda7
mkfs.xfs -f /dev/sda7
# mkfs.ext4 /dev/sda7
# mkfs.btrfs /dev/sda7
mount /dev/sda7 /fs

echo $((50<<20)) > /proc/sys/vm/dirty_bytes

pid=
for i in `seq 10`
do
        dd if=/dev/zero of=/fs/zero-$i bs=1M count=1000 &
        pid="$pid $!"
done

sleep 1

tic=$(date +'%s')
sync
tac=$(date +'%s')

echo
echo sync time: $((tac-tic))
egrep '(Dirty|Writeback|NFS_Unstable)' /proc/meminfo

pidof dd > /dev/null && { kill -9 $pid; echo sync NOT livelocked; }
------------------------------ snip ------------------------------

Reported-by: Wu Fengguang <fengguang.wu@xxxxxxxxx>
Signed-off-by: Christoph Hellwig <hch@xxxxxx>

Index: xfs/fs/xfs/linux-2.6/xfs_sync.c
===================================================================
--- xfs.orig/fs/xfs/linux-2.6/xfs_sync.c        2011-06-17 14:16:18.442399481 
+0200
+++ xfs/fs/xfs/linux-2.6/xfs_sync.c     2011-06-17 14:18:06.632394003 +0200
@@ -215,6 +215,19 @@ xfs_inode_ag_iterator(
 }
 
 STATIC int
+xfs_wait_ioend_cb(
+       struct xfs_inode        *ip,
+       struct xfs_perag        *pag,
+       int                     flags)
+{
+       xfs_ilock(ip, XFS_IOLOCK_SHARED);
+       xfs_ioend_wait(ip);
+       xfs_iunlock(ip, XFS_IOLOCK_SHARED);
+
+       return 0;
+}
+
+STATIC int
 xfs_sync_inode_data(
        struct xfs_inode        *ip,
        struct xfs_perag        *pag,
@@ -359,14 +372,15 @@ xfs_quiesce_data(
 {
        int                     error, error2 = 0;
 
-       /* push non-blocking */
-       xfs_sync_data(mp, 0);
        xfs_qm_sync(mp, SYNC_TRYLOCK);
-
-       /* push and block till complete */
-       xfs_sync_data(mp, SYNC_WAIT);
        xfs_qm_sync(mp, SYNC_WAIT);
 
+       /* wait for all pending unwritten extent conversions */
+       xfs_inode_ag_iterator(mp, xfs_wait_ioend_cb, 0);
+
+       /* force out the newly dirtied log buffers */
+       xfs_log_force(mp, XFS_LOG_SYNC);
+
        /* write superblock and hoover up shutdown errors */
        error = xfs_sync_fsdata(mp);
 

<Prev in Thread] Current Thread [Next in Thread>