xfs
[Top] [All Lists]

[PATCH] xfs: create zero range helper to convert existing extents to unw

To: xfs@xxxxxxxxxxx
Subject: [PATCH] xfs: create zero range helper to convert existing extents to unwritten
From: Brian Foster <bfoster@xxxxxxxxxx>
Date: Thu, 9 Oct 2014 15:00:41 -0400
Delivered-to: xfs@xxxxxxxxxxx
The zero range mechanism makes multiple xfs_alloc_file_space() calls to
try convert as much of a range as possible to unwritten in the event of
allocation failure. The problem is that the subsequent conversion call
will progress no further than the first preallocate call if the latter
happened to fail, because we have no ability to do extent conversion
without preallocation.

Create a helper to iterate a range of a bmap and convert all existing,
normal extents to unwritten. Call the helper from xfs_zero_file_space()
to accomplish the goal specified above. This allows zero range to put
off preallocation to the final step and thus ensures the range is zero
valued before any significant preallocation is attemped.

Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx>
---

Here's a stab at the kind of helper I was referring to in the review
discussion of the zero range rework. This applies on top of v3:

http://oss.sgi.com/archives/xfs/2014-10/msg00149.html

Lightly tested, but I'm throwing some workloads at it now and so far, so
good. Thoughts appreciated.

Brian

 fs/xfs/xfs_bmap_util.c | 145 +++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 129 insertions(+), 16 deletions(-)

diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index a69df91..937528c 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -793,6 +793,119 @@ next_block:
 }
 
 /*
+ * Convert all normal extents in a range to unwritten. This function does not
+ * preallocate. Instead, seek out existing extents and convert to unwritten.
+ * This is a helper to zero-value a range of a file. All extents other than
+ * normal, non-delalloc extents are skipped and must be handled separately.
+ */
+static int
+xfs_bmap_convert_unwritten_range(
+       struct xfs_inode        *ip,
+       xfs_fileoff_t           start_fsb,
+       xfs_fileoff_t           length)
+{
+       xfs_fileoff_t           cur_fsb;
+       xfs_fileoff_t           end_fsb;
+       int                     error = 0;
+       struct xfs_mount        *mp = ip->i_mount;
+       struct xfs_trans        *tp;
+       int                     committed;
+
+       cur_fsb = start_fsb;
+       end_fsb = start_fsb + length;   /* exclusive end */
+
+       while (cur_fsb < end_fsb) {
+               xfs_bmbt_irec_t imap;
+               int             nimaps = 1;
+               xfs_fsblock_t   firstblock;
+               xfs_bmap_free_t flist;
+               xfs_fileoff_t   count;
+
+               tp = xfs_trans_alloc(mp, XFS_TRANS_DIOSTRAT);
+               /*
+                * We only convert existing extents. Reserve only what blocks
+                * might be necessary for an extent split.
+                */
+               error = xfs_trans_reserve(tp, &M_RES(mp)->tr_write,
+                                         XFS_DIOSTRAT_SPACE_RES(mp, 0), 0);
+               if (error) {
+                       xfs_trans_cancel(tp, 0);
+                       break;
+               }
+
+               xfs_ilock(ip, XFS_ILOCK_EXCL);
+
+               /* loop until we find a normal extent */
+               do {
+                       error = xfs_bmapi_read(ip, cur_fsb, end_fsb - cur_fsb,
+                                       &imap, &nimaps, XFS_BMAPI_ENTIRE);
+                       if (error) {
+                               /* something screwed, just bail */
+                               if (!XFS_FORCED_SHUTDOWN(ip->i_mount)) {
+                                       xfs_alert(ip->i_mount,
+                               "Failed mapping lookup ino %lld fsb %lld.",
+                                                       ip->i_ino, cur_fsb);
+                               }
+                               break;
+                       }
+
+                       if (!nimaps) {
+                               cur_fsb++;
+                               continue;
+                       }
+
+                       /* look for normal extents */
+                       if (imap.br_state == XFS_EXT_NORM &&
+                           imap.br_startblock != DELAYSTARTBLOCK &&
+                           imap.br_startblock != HOLESTARTBLOCK)
+                               break;
+
+                       cur_fsb = imap.br_startoff + imap.br_blockcount;
+               } while (cur_fsb < end_fsb);
+
+               /* check whether we failed or found nothing to convert */
+               if (error || cur_fsb >= end_fsb) {
+                       xfs_iunlock(ip, XFS_ILOCK_EXCL);
+                       goto out_cancel;
+               }
+
+               xfs_trans_ijoin(tp, ip, XFS_ILOCK_EXCL);
+
+               /*
+                * convert to the end of the extent or the end of the range,
+                * whichever is shorter
+                */
+               count = imap.br_blockcount - (cur_fsb - imap.br_startoff);
+               count = min_t(xfs_fileoff_t, count, end_fsb - cur_fsb);
+
+               xfs_bmap_init(&flist, &firstblock);
+
+               error = xfs_bmapi_write(tp, ip, cur_fsb, count,
+                                       XFS_BMAPI_PREALLOC | XFS_BMAPI_CONVERT,
+                                       &firstblock, 0, &imap, &nimaps, &flist);
+               if (error)
+                       goto out_cancel;
+
+               error = xfs_bmap_finish(&tp, &flist, &committed);
+               if (error)
+                       goto out_cancel;
+
+               error = xfs_trans_commit(tp, XFS_TRANS_RELEASE_LOG_RES);
+               if (error)
+                       break;
+
+               /* set cur fsb to start of next extent */
+               cur_fsb = imap.br_startoff + imap.br_blockcount;
+       }
+
+       return error;
+
+out_cancel:
+       xfs_trans_cancel(tp, XFS_TRANS_RELEASE_LOG_RES | XFS_TRANS_ABORT);
+       return error;
+}
+
+/*
  * Test whether it is appropriate to check an inode for and free post EOF
  * blocks. The 'force' parameter determines whether we should also consider
  * regular files that are marked preallocated or append-only.
@@ -1433,28 +1546,28 @@ xfs_zero_file_space(
        }
 
        /*
-        * First, preallocate the unaligned range to ensure the entire range is
-        * allocated. This does not convert extents because we cannot convert
-        * partial blocks. Second, convert the block-aligned range to unwritten
-        * extents. We do this as such because passing the entire range for
-        * allocation facilitates ideal contiguity from the allocator.
+        * At this point partial blocks are consistent on-disk and in pagecache
+        * and delalloc mappings are consistent between pagecache (buffer heads)
+        * and the in-core extent tree. We can proceed with allocation without
+        * worry of failure leaving inconsistencies.
         *
-        * Note that we attempt unwritten conversion even if we fail to
-        * preallocate space. This is an effort to convert as many extents as
-        * possible before we inevitably return ENOSPC.
-        *
-        * XXX: A helper that converts existing extents but does not preallocate
-        * would help clean all this up. We could convert extents first,
-        * preallocate second, and thus guarantee a zero-value range on ENOSPC.
+        * First, convert existing extents in the block aligned range to
+        * unwritten. Next, preallocate the unaligned range to fill in the gaps
+        * and cover partial start and end blocks. This sequence means that the
+        * range has likely been zero-valued even if we run out of space.
         */
+       if (end_boundary > start_boundary) {
+               error = xfs_bmap_convert_unwritten_range(ip,
+                               XFS_B_TO_FSBT(mp, start_boundary),
+                               XFS_B_TO_FSB(mp,
+                                            end_boundary - start_boundary));
+               if (error)
+                       goto out;
+       }
        error = xfs_alloc_file_space(ip, round_down(offset, blksize),
                                     round_up(offset + len, blksize) -
                                     round_down(offset, blksize),
                                     XFS_BMAPI_PREALLOC);
-       if (end_boundary > start_boundary)
-               error = xfs_alloc_file_space(ip, start_boundary,
-                                       end_boundary - start_boundary,
-                                       XFS_BMAPI_PREALLOC | XFS_BMAPI_CONVERT);
 
 out:
        return error;
-- 
1.8.3.1

<Prev in Thread] Current Thread [Next in Thread>
  • [PATCH] xfs: create zero range helper to convert existing extents to unwritten, Brian Foster <=