xfs
[Top] [All Lists]

[PATCH 05/27] xfs: don't do IO when creating an new inode

To: xfs@xxxxxxxxxxx
Subject: [PATCH 05/27] xfs: don't do IO when creating an new inode
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Wed, 12 Jun 2013 20:22:25 +1000
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <1371032567-21772-1-git-send-email-david@xxxxxxxxxxxxx>
References: <1371032567-21772-1-git-send-email-david@xxxxxxxxxxxxx>
From: Dave Chinner <dchinner@xxxxxxxxxx>

When we are allocating a new inode, we read the inode cluster off
disk to increment the generation number. We are already using a
random generation number for newly allocated inodes, so if we are not
using the ikeep mode, we can just generate a new generation number
when we initialise the newly allocated inode.

This avoids the need for reading the inode buffer during inode
creation. This will speed up allocation of inodes in cold, partially
allocated clusters as they will no longer need to be read from disk
during allocation. It will also reduce the CPU overhead of inode
allocation by not having the process the buffer read, even on cache
hits.

Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
---
 fs/xfs/xfs_inode.c |   36 ++++++++++++++++++++++++++++--------
 1 file changed, 28 insertions(+), 8 deletions(-)

diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 7f7be5f..d1f76da 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -1028,6 +1028,11 @@ xfs_dinode_calc_crc(
 
 /*
  * Read the disk inode attributes into the in-core inode structure.
+ *
+ * If we are initialising a new inode and we are not utilising the
+ * XFS_MOUNT_IKEEP inode cluster mode, we can simple build the new inode core
+ * with a random generation number. If we are keeping inodes around, we need to
+ * read the inode cluster to get the existing generation number off disk.
  */
 int
 xfs_iread(
@@ -1047,6 +1052,22 @@ xfs_iread(
        if (error)
                return error;
 
+       /* shortcut IO on inode allocation if possible */
+       if ((iget_flags & XFS_IGET_CREATE) &&
+           !(mp->m_flags & XFS_MOUNT_IKEEP)) {
+               /* initialise the on-disk inode core */
+               memset(&ip->i_d, 0, sizeof(ip->i_d));
+               ip->i_d.di_magic = XFS_DINODE_MAGIC;
+               ip->i_d.di_gen = prandom_u32();
+               if (xfs_sb_version_hascrc(&mp->m_sb)) {
+                       ip->i_d.di_version = 3;
+                       ip->i_d.di_ino = ip->i_ino;
+                       uuid_copy(&ip->i_d.di_uuid, &mp->m_sb.sb_uuid);
+               } else
+                       ip->i_d.di_version = 2;
+               return 0;
+       }
+
        /*
         * Get pointers to the on-disk inode and the buffer containing it.
         */
@@ -1133,17 +1154,16 @@ xfs_iread(
        xfs_buf_set_ref(bp, XFS_INO_REF);
 
        /*
-        * Use xfs_trans_brelse() to release the buffer containing the
-        * on-disk inode, because it was acquired with xfs_trans_read_buf()
-        * in xfs_imap_to_bp() above.  If tp is NULL, this is just a normal
+        * Use xfs_trans_brelse() to release the buffer containing the on-disk
+        * inode, because it was acquired with xfs_trans_read_buf() in
+        * xfs_imap_to_bp() above.  If tp is NULL, this is just a normal
         * brelse().  If we're within a transaction, then xfs_trans_brelse()
         * will only release the buffer if it is not dirty within the
         * transaction.  It will be OK to release the buffer in this case,
-        * because inodes on disk are never destroyed and we will be
-        * locking the new in-core inode before putting it in the hash
-        * table where other processes can find it.  Thus we don't have
-        * to worry about the inode being changed just because we released
-        * the buffer.
+        * because inodes on disk are never destroyed and we will be locking the
+        * new in-core inode before putting it in the cache where other
+        * processes can find it.  Thus we don't have to worry about the inode
+        * being changed just because we released the buffer.
         */
  out_brelse:
        xfs_trans_brelse(tp, bp);
-- 
1.7.10.4

<Prev in Thread] Current Thread [Next in Thread>