[PATCH 07/18] xfs: don't use vfs writeback for pure metadata modifications

To: xfs@xxxxxxxxxxx
Subject: [PATCH 07/18] xfs: don't use vfs writeback for pure metadata modifications
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Tue, 14 Sep 2010 20:56:06 +1000
In-reply-to: <1284461777-1496-1-git-send-email-david@xxxxxxxxxxxxx>
References: <1284461777-1496-1-git-send-email-david@xxxxxxxxxxxxx>
From: Dave Chinner <dchinner@xxxxxxxxxx>

Under heavy multi-way parallel create workloads, the VFS struggles to write
back all the changed inodes in age order. The bdi flusher thread becomes CPU
bound, spending 85% of its time in the VFS code, mostly traversing the
superblock dirty inode list to pick out the dirty inodes old enough to flush.

We already keep an index of all metadata changes in age order - in the AIL -
and continued log pressure will do age ordered writeback without any extra
overhead at all. If there is no pressure on the log, xfssyncd will
periodically write back metadata in ascending disk address order, which is
very efficient.
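
To illustrate why an age-ordered index makes this cheap, here is a minimal
user-space sketch. It is not the kernel AIL code: the names toy_item, toy_ail,
toy_ail_insert and toy_ail_push are hypothetical, and the model assumes items
arrive with non-decreasing LSNs and are never relogged. The point is only that
pushing everything older than a target touches the old items and stops at the
first newer one, rather than scanning the whole list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a logged item tracked in age order. */
struct toy_item {
	unsigned long	lsn;	/* log sequence number: smaller means older */
	struct toy_item	*next;
};

/* Hypothetical age-ordered list; the head is always the oldest item. */
struct toy_ail {
	struct toy_item	*head;
	struct toy_item	*tail;
};

/*
 * Append an item.  This model assumes LSNs never decrease between
 * inserts, so appending at the tail keeps the list sorted by age.
 */
static void
toy_ail_insert(
	struct toy_ail	*ail,
	struct toy_item	*item)
{
	assert(ail->tail == NULL || ail->tail->lsn <= item->lsn);

	item->next = NULL;
	if (ail->tail)
		ail->tail->next = item;
	else
		ail->head = item;
	ail->tail = item;
}

/*
 * Remove every item with lsn <= target_lsn from the head of the list
 * and return how many were removed.  The walk stops at the first item
 * that is too new, so the cost is proportional to the number of items
 * pushed, not to the total length of the list.
 */
static size_t
toy_ail_push(
	struct toy_ail	*ail,
	unsigned long	target_lsn)
{
	size_t		pushed = 0;

	while (ail->head && ail->head->lsn <= target_lsn) {
		struct toy_item	*item = ail->head;

		ail->head = item->next;
		if (!ail->head)
			ail->tail = NULL;
		item->next = NULL;
		pushed++;
	}
	return pushed;
}
```

With items at LSNs 10, 20 and 30 inserted in that order, a push to target 20
removes the first two and leaves the third at the head without ever examining
anything beyond it.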

Hence we can stop marking VFS inodes dirty during transaction commit or when
changing timestamps during transactions. This limits the inodes on the
superblock dirty list to those containing dirty data or unlogged metadata
changes.

Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
---
 fs/xfs/linux-2.6/xfs_iops.c |   18 +++++-------------
 fs/xfs/xfs_inode_item.c     |    9 ---------
 2 files changed, 5 insertions(+), 22 deletions(-)

diff --git a/fs/xfs/linux-2.6/xfs_iops.c b/fs/xfs/linux-2.6/xfs_iops.c
index 1e084ff..8f21765 100644
--- a/fs/xfs/linux-2.6/xfs_iops.c
+++ b/fs/xfs/linux-2.6/xfs_iops.c
@@ -95,9 +95,11 @@ xfs_mark_inode_dirty(
 }
 
 /*
- * Change the requested timestamp in the given inode.
- * We don't lock across timestamp updates, and we don't log them but
- * we do record the fact that there is dirty information in core.
+ * Change the requested timestamp in the given inode.  We don't lock across
+ * timestamp updates, and we don't log them directly.  However, all timestamp
+ * changes occur within transactions that log the inode core, so the timestamp
+ * changes will be copied back into the XFS inode during transaction commit.
+ * Hence we do not need to dirty the inode here.
  */
 void
 xfs_ichgtime(
@@ -106,27 +108,17 @@ xfs_ichgtime(
 {
        struct inode    *inode = VFS_I(ip);
        timespec_t      tv;
-       int             sync_it = 0;
 
        tv = current_fs_time(inode->i_sb);
 
        if ((flags & XFS_ICHGTIME_MOD) &&
            !timespec_equal(&inode->i_mtime, &tv)) {
                inode->i_mtime = tv;
-               sync_it = 1;
        }
        if ((flags & XFS_ICHGTIME_CHG) &&
            !timespec_equal(&inode->i_ctime, &tv)) {
                inode->i_ctime = tv;
-               sync_it = 1;
        }
-
-       /*
-        * Update complete - now make sure everyone knows that the inode
-        * is dirty.
-        */
-       if (sync_it)
-               xfs_mark_inode_dirty_sync(ip);
 }
 
 /*
diff --git a/fs/xfs/xfs_inode_item.c b/fs/xfs/xfs_inode_item.c
index fe00777..c7ac020 100644
--- a/fs/xfs/xfs_inode_item.c
+++ b/fs/xfs/xfs_inode_item.c
@@ -223,15 +223,6 @@ xfs_inode_item_format(
        nvecs        = 1;
 
        /*
-        * Make sure the linux inode is dirty. We do this before
-        * clearing i_update_core as the VFS will call back into
-        * XFS here and set i_update_core, so we need to dirty the
-        * inode first so that the ordering of i_update_core and
-        * unlogged modifications still works as described below.
-        */
-       xfs_mark_inode_dirty_sync(ip);
-
-       /*
         * Clear i_update_core if the timestamps (or any other
         * non-transactional modification) need flushing/logging
         * and we're about to log them with the rest of the core.
-- 
1.7.1
