
Re: [QUESTION] about the freelist allocator in XFS

To: xfs@xxxxxxxxxxx
Subject: Re: [QUESTION] about the freelist allocator in XFS
From: Kaho Ng <ngkaho1234@xxxxxxxxx>
Date: Mon, 11 Jul 2016 00:57:59 +0800
Well, a comment about the corner case I mentioned can be found in
xfsprogs/repair/phase5.c, but I still have no idea how the XFS kernel
module prevents it.

/*
 * We need to leave some free records in the tree for the corner case of
 * setting up the AGFL. This may require allocation of blocks, and as
 * such can require insertion of new records into the tree (e.g. moving
 * a record in the by-count tree when a long extent is shortened). If we
 * pack the records into the leaves with no slack space, this requires a
 * leaf split to occur and a block to be allocated from the free list.
 * If we don't have any blocks on the free list (because we are setting
 * it up!), then we fail, and the filesystem will fail with the same
 * failure at runtime. Hence leave a couple of records slack space in
 * each block to allow immediate modification of the tree without
 * requiring splits to be done.
 *
 * XXX(hch): any reason we don't just look at mp->m_alloc_mxr?
 */
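
To make the slack-space idea in the comment above concrete, here is a
minimal sketch in C. It is a toy model, not code taken from xfsprogs or
the kernel: SLACK_RECS, packed_recs_per_leaf(), leaf_blocks_needed() and
insert_needs_split() are hypothetical names, and the value 2 is only an
assumption based on the phrase "a couple of records".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical: records of headroom left in every rebuilt leaf. */
#define SLACK_RECS 2u

/* Records placed in each leaf during a rebuild: capacity minus slack. */
static unsigned int packed_recs_per_leaf(unsigned int maxrecs)
{
	return maxrecs > SLACK_RECS ? maxrecs - SLACK_RECS : 1u;
}

/* Leaf blocks needed for nrecs records at that packing density. */
static unsigned int leaf_blocks_needed(unsigned int nrecs,
				       unsigned int maxrecs)
{
	unsigned int per_leaf;

	assert(maxrecs > 0);
	per_leaf = packed_recs_per_leaf(maxrecs);
	return (nrecs + per_leaf - 1) / per_leaf;	/* round up */
}

/* In this model an insert forces a split only if the leaf is full. */
static bool insert_needs_split(unsigned int recs_in_leaf,
			       unsigned int maxrecs)
{
	return recs_in_leaf >= maxrecs;
}
```

With a leaf capacity of 10 records, 100 free extents are packed 8 per
leaf into 13 leaves, and a leaf holding 8 records can take an insert
without a split. A leaf packed to all 10 records could not, which is the
failure the comment describes.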

On Thu, Jul 7, 2016 at 7:01 PM, Kaho Ng <ngkaho1234@xxxxxxxxx> wrote:
> I am trying to investigate how the freelist allocator in XFS interacts
> with the free space B+Tree allocator.
> First I prepared a patch
> <https://gist.github.com/22ffca35929e67c08759b057779b7566> against
> linux-source/fs/xfs/libxfs/xfs_alloc.c to print debugging messages
> (the kernel version used is linux-3.10.0-327.22.2.el7).
> Then I wrote a simple utility
> <https://gist.github.com/992364ceca984d3f14099ec94aaacd9d> to make
> TONS of holes in a filesystem by calling fallocate() to punch holes in
> a file that is almost as large as the volume.
>
> I created an XFS filesystem image by the following steps:
> 1. fallocate -l 80G /mnt/disk2/xfs
> 2. mkfs.xfs -f -d agcount=1 /mnt/disk2/xfs
>
> Then I created a large file with fallocate:
> fallocate -l 85823746048 /mnt/test/abc
>
> which in the end left only 4 blocks available on the volume:
> /dev/loop0      20961280 20961276         4 100% /mnt/test
>
> The result of xfs_bmap against /mnt/test/abc:
> /mnt/test/abc:
>  EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET              TOTAL FLAGS
>    0: [0..167624503]:  83000..167707503  0 (83000..167707503) 167624504 10000
>
> After that, I used the hole-punching utility above to punch holes in
> the file, and captured the kmsg output.
>
> When reading the log output
> <https://gist.github.com/890076405e1c13c0a952a579e25e6afe>, I
> realised that no B+Tree split is triggered by
> xfs_alloc_fix_freelist() when xfs_free_extent() is called.
> Isn't a B+Tree split possible in the by-size B+Tree even when a
> longer free space record is truncated to a shorter one? But all I
> found in the log is a few tree shrinks... And from reading the source
> code of the free space allocator, I found that B+Tree growth is
> impossible in this case, at least...
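
For readers without access to the gists, the hole-punching step in the
quoted message can be approximated as below. This is only a sketch under
assumptions: the real utility's punching pattern is not shown here, so
punching every other block is my guess at a pattern that fragments free
space, and hole_count(), nth_hole_offset() and punch_alternating_holes()
are hypothetical names. Only fallocate() with FALLOC_FL_PUNCH_HOLE |
FALLOC_FL_KEEP_SIZE is the real Linux interface.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/falloc.h>
#include <sys/types.h>

/* Number of holes when every second block of the file is punched. */
static off_t hole_count(off_t file_size, off_t blksz)
{
	return file_size / (2 * blksz);
}

/* Byte offset of the n-th hole: blocks 1, 3, 5, ... are punched. */
static off_t nth_hole_offset(off_t n, off_t blksz)
{
	return (2 * n + 1) * blksz;
}

/*
 * Punch one-block holes at every odd block of fd. Each hole frees a
 * small extent that is not adjacent to other free space, so the free
 * space B+Trees gain one record per hole.
 * Returns the number of holes punched, or -1 on the first failure
 * (errno is left as set by fallocate()).
 */
static off_t punch_alternating_holes(int fd, off_t file_size, off_t blksz)
{
	off_t n, count = hole_count(file_size, blksz);

	for (n = 0; n < count; n++) {
		if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
			      nth_hole_offset(n, blksz), blksz) != 0)
			return -1;
	}
	return n;
}
```

A caller would open the large file read-write, take the file size from
fstat() and the block size from the filesystem, and pass them in.
FALLOC_FL_KEEP_SIZE is required together with FALLOC_FL_PUNCH_HOLE, so
the file size itself never changes.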
