xfs
[Top] [All Lists]

Re: [PATCH v2] xfs: Avoid pathological backwards allocation

To: xfs@xxxxxxxxxxx
Subject: Re: [PATCH v2] xfs: Avoid pathological backwards allocation
From: Jan Kara <jack@xxxxxxx>
Date: Mon, 20 May 2013 15:56:07 +0200
Cc: dchinner@xxxxxxxxxx, tinguely@xxxxxxx, Jan Kara <jack@xxxxxxx>
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <1365710996-16439-1-git-send-email-jack@xxxxxxx>
References: <1365710996-16439-1-git-send-email-jack@xxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
On Thu 11-04-13 22:09:56, Jan Kara wrote:
> Writing a large file using direct IO in 16 MB chunks sometimes results
> in a pathological allocation pattern where 16 MB chunks of large free
> extent are allocated to a file in a reversed order. So extents of a file
> look for example as:
> 
>  ext logical physical expected length flags
>    0        0        13          4550656
>    1  4550656 188136807   4550668 12562432
>    2 17113088 200699240 200699238 622592
>    3 17735680 182046055 201321831   4096
>    4 17739776 182041959 182050150   4096
>    5 17743872 182037863 182046054   4096
>    6 17747968 182033767 182041958   4096
>    7 17752064 182029671 182037862   4096
> ...
> 6757 45400064 154381644 154389835   4096
> 6758 45404160 154377548 154385739   4096
> 6759 45408256 252951571 154381643  73728 eof
> 
> This happens because XFS_ALLOCTYPE_THIS_BNO allocation fails (the last
> extent in the file cannot be further extended) so we fall back to
> XFS_ALLOCTYPE_NEAR_BNO allocation which picks end of a large free
> extent as the best place to continue the file. Since the chunk at the
> end of the free extent again cannot be further extended, this behavior
> repeats until the whole free extent is consumed in a reversed order.
> 
> For data allocations this backward allocation isn't beneficial so make
> xfs_alloc_compute_diff() pick start of a free extent instead of its end
> for them. That avoids the backward allocation pattern.
> 
> See thread at http://oss.sgi.com/archives/xfs/2013-03/msg00144.html for
> more details about the reproduction case and why this solution was
> chosen.
> 
> Based on idea by Dave Chinner <dchinner@xxxxxxxxxx>.
> 
> CC: Dave Chinner <dchinner@xxxxxxxxxx>
> Reviewed-by: Dave Chinner <dchinner@xxxxxxxxxx>
> Signed-off-by: Jan Kara <jack@xxxxxxx>
> ---
>  fs/xfs/xfs_alloc.c |   24 ++++++++++++++++++------
>  1 files changed, 18 insertions(+), 6 deletions(-)
> 
> v2: Updated comment and commit description.
  Could anybody pull this patch into XFS tree? I don't see it there...

                                                                Honza

> 
> diff --git a/fs/xfs/xfs_alloc.c b/fs/xfs/xfs_alloc.c
> index 0ad2325..f99113d 100644
> --- a/fs/xfs/xfs_alloc.c
> +++ b/fs/xfs/xfs_alloc.c
> @@ -173,6 +173,7 @@ xfs_alloc_compute_diff(
>       xfs_agblock_t   wantbno,        /* target starting block */
>       xfs_extlen_t    wantlen,        /* target length */
>       xfs_extlen_t    alignment,      /* target alignment */
> +     char            userdata,       /* are we allocating data? */
>       xfs_agblock_t   freebno,        /* freespace's starting block */
>       xfs_extlen_t    freelen,        /* freespace's length */
>       xfs_agblock_t   *newbnop)       /* result: best start block from free */
> @@ -187,7 +188,14 @@ xfs_alloc_compute_diff(
>       ASSERT(freelen >= wantlen);
>       freeend = freebno + freelen;
>       wantend = wantbno + wantlen;
> -     if (freebno >= wantbno) {
> +     /*
> +      * We want to allocate from the start of a free extent if it is past
> +      * the desired block or if we are allocating user data and the free
> +      * extent is before desired block. The second case is there to allow
> +      * for contiguous allocation from the remaining free space if the file
> +      * grows in the short term.
> +      */
> +     if (freebno >= wantbno || (userdata && freeend < wantend)) {
>               if ((newbno1 = roundup(freebno, alignment)) >= freeend)
>                       newbno1 = NULLAGBLOCK;
>       } else if (freeend >= wantend && alignment > 1) {
> @@ -772,7 +780,8 @@ xfs_alloc_find_best_extent(
>                       xfs_alloc_fix_len(args);
>  
>                       sdiff = xfs_alloc_compute_diff(args->agbno, args->len,
> -                                                    args->alignment, *sbnoa,
> +                                                    args->alignment,
> +                                                    args->userdata, *sbnoa,
>                                                      *slena, &new);
>  
>                       /*
> @@ -943,7 +952,8 @@ restart:
>                       if (args->len < blen)
>                               continue;
>                       ltdiff = xfs_alloc_compute_diff(args->agbno, args->len,
> -                             args->alignment, ltbnoa, ltlena, &ltnew);
> +                             args->alignment, args->userdata, ltbnoa,
> +                             ltlena, &ltnew);
>                       if (ltnew != NULLAGBLOCK &&
>                           (args->len > blen || ltdiff < bdiff)) {
>                               bdiff = ltdiff;
> @@ -1095,7 +1105,8 @@ restart:
>                       args->len = XFS_EXTLEN_MIN(ltlena, args->maxlen);
>                       xfs_alloc_fix_len(args);
>                       ltdiff = xfs_alloc_compute_diff(args->agbno, args->len,
> -                             args->alignment, ltbnoa, ltlena, &ltnew);
> +                             args->alignment, args->userdata, ltbnoa,
> +                             ltlena, &ltnew);
>  
>                       error = xfs_alloc_find_best_extent(args,
>                                               &bno_cur_lt, &bno_cur_gt,
> @@ -1111,7 +1122,8 @@ restart:
>                       args->len = XFS_EXTLEN_MIN(gtlena, args->maxlen);
>                       xfs_alloc_fix_len(args);
>                       gtdiff = xfs_alloc_compute_diff(args->agbno, args->len,
> -                             args->alignment, gtbnoa, gtlena, &gtnew);
> +                             args->alignment, args->userdata, gtbnoa,
> +                             gtlena, &gtnew);
>  
>                       error = xfs_alloc_find_best_extent(args,
>                                               &bno_cur_gt, &bno_cur_lt,
> @@ -1170,7 +1182,7 @@ restart:
>       }
>       rlen = args->len;
>       (void)xfs_alloc_compute_diff(args->agbno, rlen, args->alignment,
> -                                  ltbnoa, ltlena, &ltnew);
> +                                  args->userdata, ltbnoa, ltlena, &ltnew);
>       ASSERT(ltnew >= ltbno);
>       ASSERT(ltnew + rlen <= ltbnoa + ltlena);
>       ASSERT(ltnew + rlen <= 
> be32_to_cpu(XFS_BUF_TO_AGF(args->agbp)->agf_length));
> -- 
> 1.7.1
> 
-- 
Jan Kara <jack@xxxxxxx>
SUSE Labs, CR

<Prev in Thread] Current Thread [Next in Thread>