
[PATCH 2/2] xfs: reclaim all inodes by background tree walks

To: xfs@xxxxxxxxxxx
Subject: [PATCH 2/2] xfs: reclaim all inodes by background tree walks
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Thu, 7 Jan 2010 10:05:25 +1100
In-reply-to: <1262819125-27083-1-git-send-email-david@xxxxxxxxxxxxx>
References: <1262819125-27083-1-git-send-email-david@xxxxxxxxxxxxx>
We cannot do direct inode reclaim without taking the flush lock to
ensure that we do not reclaim an inode under IO. We check the inode
is clean before doing direct reclaim, but this is not good enough
because the inode flush code marks the inode clean once it has
copied the in-core dirty state to the backing buffer.

It is the flush lock that determines whether the inode is still
under IO, even though it is marked clean, and the inode is still
required at IO completion so we can't reclaim it even though it is
clean in core. Hence the requirement that we need to take the
flush lock even on clean inodes because this guarantees that the
inode writeback IO has completed and it is safe to reclaim the
inode.

With delayed write inode flushing, we could end up waiting a long
time on the flush lock even for a clean inode. The background
reclaim already handles this efficiently, so avoid all the problems
by killing the direct reclaim path altogether.

Signed-off-by: Dave Chinner <david@xxxxxxxxxxxxx>
 fs/xfs/linux-2.6/xfs_super.c |   14 ++++++--------
 fs/xfs/linux-2.6/xfs_sync.c  |   11 ++++++++++-
 2 files changed, 16 insertions(+), 9 deletions(-)

diff --git a/fs/xfs/linux-2.6/xfs_super.c b/fs/xfs/linux-2.6/xfs_super.c
index f3dd67d..23768f4 100644
--- a/fs/xfs/linux-2.6/xfs_super.c
+++ b/fs/xfs/linux-2.6/xfs_super.c
@@ -953,16 +953,14 @@ xfs_fs_destroy_inode(
        ASSERT_ALWAYS(!xfs_iflags_test(ip, XFS_IRECLAIM));

        /*
-        * If we have nothing to flush with this inode then complete the
-        * teardown now, otherwise delay the flush operation.
+        * We always use background reclaim here because even if the
+        * inode is clean, it still may be under IO and hence we have
+        * to take the flush lock. The background reclaim path handles
+        * this more efficiently than we can here, so simply let background
+        * reclaim tear down all inodes.
         */
-       if (!xfs_inode_clean(ip)) {
-               xfs_inode_set_reclaim_tag(ip);
-               return;
-       }
-       xfs_ireclaim(ip);
+       xfs_inode_set_reclaim_tag(ip);
 }
diff --git a/fs/xfs/linux-2.6/xfs_sync.c b/fs/xfs/linux-2.6/xfs_sync.c
index 6d1cd6e..a1d7876 100644
--- a/fs/xfs/linux-2.6/xfs_sync.c
+++ b/fs/xfs/linux-2.6/xfs_sync.c
@@ -700,6 +700,8 @@ xfs_reclaim_inode(
        /*
         * In the case of a forced shutdown we rely on xfs_iflush() to
         * wait for the inode to be unpinned before returning an error.
+        * Because we hold the flush lock, we know that the inode cannot
+        * be under IO, so if it reports clean it can be reclaimed.
         */
        if (!is_bad_inode(VFS_I(ip)) && !xfs_inode_clean(ip)) {
@@ -726,9 +728,16 @@ xfs_reclaim_inode(
        return 0;
+       /*
+        * We could return EAGAIN here to make reclaim rescan the inode tree in
+        * a short while. However, this just burns CPU time scanning the tree
+        * waiting for IO to complete and xfssyncd never goes back to the idle
+        * state. Instead, return 0 to let the next scheduled background reclaim
+        * attempt to reclaim the inode again.
+        */
        xfs_iflags_clear(ip, XFS_IRECLAIM);
        xfs_iunlock(ip, XFS_ILOCK_EXCL);
-       return EAGAIN;
+       return 0;
