Re: Strange hole creation behavior

To: Brian Foster <bfoster@xxxxxxxxxx>
Subject: Re: Strange hole creation behavior
From: Pádraig Brady <P@xxxxxxxxxxxxxx>
Date: Fri, 11 Apr 2014 23:58:11 +0100
Cc: xfs-oss <xfs@xxxxxxxxxxx>, Ondřej Vašík <ovasik@xxxxxxxxxx>, Coreutils <coreutils@xxxxxxx>
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20140411204338.GA38024@xxxxxxxxxxxxxxx>
References: <534822D7.7090803@xxxxxxxxxxxxxx> <20140411204338.GA38024@xxxxxxxxxxxxxxx>
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130110 Thunderbird/17.0.2
On 04/11/2014 09:43 PM, Brian Foster wrote:
> On Fri, Apr 11, 2014 at 06:13:59PM +0100, Pádraig Brady wrote:
>> So this coreutils test is failing on XFS:
>> http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=blob;f=tests/dd/sparse.sh;h=06efc7017
>> Specifically the last hole check on line 66.
>>
>> In summary what's happening is that a write(1MiB), lseek(1MiB), write(1MiB)
>> creates only a 64KiB hole. Is that expected?
>>
> 
> This is expected behavior due to speculative preallocation. An FAQ with
> regard to this behavior is pending, but see here for reference:
> 
> http://oss.sgi.com/archives/xfs/2014-04/msg00083.html
> 
> In that particular write(1MB), lseek(+1MB), write(1MB) workload, each
> write is preallocating some extra space beyond the current EOF. The seek
> then moves past that space, but the space doesn't go away. The
> subsequent writes will extend EOF. The previously preallocated space now
> resides in the middle of the file and can't be trimmed away when the
> file is closed.
> 
>> Now a 1MiB hole is supported using truncate:
>>   dd if=/dev/urandom of=file.in bs=1M count=1 iflag=fullblock
>>   truncate -s+1M file.in
>>   dd if=/dev/urandom of=file.in bs=1M count=1 iflag=fullblock conv=notrunc oflag=append
>>   $ du -k file.in
>>   2048  file.in
>>
> 
> This works simply because it is broken into multiple commands. When the
> first dd exits, the excess space is trimmed off (the file descriptor is
> closed). The subsequent truncate extends the file size without any
> extra space getting caught between the old and new EOF.
> 
> You can confirm this by using the 'allocsize=4k' mount option to the XFS
> mount. If you wanted something more generic for the purpose of testing
> the coreutils functionality, you could also set the size of file.out in
> advance. E.g., with preallocation in effect:
> 
> # dd if=file.in of=file.out bs=1M conv=sparse
> # xfs_bmap -v file.out 
> file.out:
>  EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET        TOTAL
>    0: [0..3967]:       9773944..9777911  1 (9080..13047)     3968
>    1: [3968..4095]:    hole                                   128
>    2: [4096..6143]:    9778040..9780087  1 (13176..15223)    2048
> 
> ... and then prevent preallocation by ensuring writes do not extend the
> file:
> 
> # rm -f file.out 
> # truncate --size=3M file.out
> # dd if=file.in of=file.out bs=1M conv=sparse,notrunc
> # xfs_bmap -v file.out 
> file.out:
>  EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET        TOTAL
>    0: [0..2047]:       9773944..9775991  1 (9080..11127)     2048
>    1: [2048..4095]:    hole                                  2048
>    2: [4096..6143]:    9778040..9780087  1 (13176..15223)    2048
> 
> Hope that helps.

Excellent info, thanks.
With that I can adjust the test so it passes (patch attached).

So for reference this means that cp can no longer recreate holes
<= 1MiB from source to dest (with the default XFS allocation size):

$ cp --sparse=always file.in cp.out
$ xfs_bmap -v !$
xfs_bmap -v cp.out
cp.out:
 EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET        TOTAL
   0: [0..3967]:       219104..223071    0 (219104..223071)  3968
   1: [3968..4095]:    hole                                   128
   2: [4096..6143]:    225720..227767    0 (225720..227767)  2048

$ xfs_bmap -v file.out
file.out:
 EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET        TOTAL
   0: [0..2047]:       229816..231863    0 (229816..231863)  2048
   1: [2048..4095]:    hole                                  2048
   2: [4096..6143]:    233912..235959    0 (233912..235959)  2048

$ cp file.out cp.out
$ xfs_bmap -v cp.out
cp.out:
 EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET        TOTAL
   0: [0..3967]:       250296..254263    0 (250296..254263)  3968
   1: [3968..4095]:    hole                                   128
   2: [4096..6143]:    254392..256439    0 (254392..256439)  2048
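
For reading these tables: xfs_bmap reports offsets and totals in units of
512-byte blocks, so the 128-block hole above is the 64KiB hole mentioned at
the start of the thread. A minimal sketch of the conversion follows;
blocks_to_kib is a hypothetical helper name, not part of any tool:

```shell
# Convert a count of 512-byte blocks (the unit xfs_bmap reports) to KiB.
# blocks_to_kib is an illustrative name, not an existing command.
blocks_to_kib() {
    echo $(( $1 * 512 / 1024 ))
}

blocks_to_kib 128     # hole left in cp.out
blocks_to_kib 3968    # first extent of cp.out, preallocation included
blocks_to_kib 2048    # hole in file.out
```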

Though if we bump up the hole size the representation is better:

$ dd if=/dev/urandom of=bigfile.in bs=1M count=1 iflag=fullblock
$ truncate -s+10M bigfile.in
$ dd if=/dev/urandom of=bigfile.in bs=1M count=1 iflag=fullblock conv=notrunc oflag=append

$ xfs_bmap -v bigfile.in
bigfile.in:
 EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET        TOTAL
   0: [0..2047]:       231864..233911    0 (231864..233911)  2048
   1: [2048..22527]:   hole                                 20480
   2: [22528..24575]:  256440..258487    0 (256440..258487)  2048

$ cp bigfile.in bigfile.out
$ xfs_bmap -v bigfile.out
bigfile.out:
 EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET        TOTAL
   0: [0..3967]:       260408..264375    0 (260408..264375)  3968
   1: [3968..22527]:   hole                                 18560
   2: [22528..24575]:  264376..266423    0 (264376..266423)  2048
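
Where xfs_bmap is not available, a rough filesystem-independent check is to
compare the apparent size of a file with its allocated size. The sketch
below assumes GNU stat and GNU coreutils. sparse_report and the temporary
file names are made up for illustration, and the allocated figures will
vary with the filesystem and its allocation settings:

```shell
# Print apparent and allocated size in bytes for one file.
# %s = apparent size, %b = allocated blocks, %B = bytes per block in %b.
# sparse_report is an illustrative helper, not an existing command.
sparse_report() {
    apparent=$(stat -c %s "$1")
    allocated=$(( $(stat -c %b "$1") * $(stat -c %B "$1") ))
    echo "$1: apparent=$apparent allocated=$allocated"
}

tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

# Same layout as bigfile.in: 1MiB data, 10MiB hole, 1MiB data.
dd if=/dev/urandom of="$tmp/big.in" bs=1M count=1 iflag=fullblock 2>/dev/null
truncate -s +10M "$tmp/big.in"
dd if=/dev/urandom of="$tmp/big.in" bs=1M count=1 iflag=fullblock \
   conv=notrunc oflag=append 2>/dev/null

cp --sparse=always "$tmp/big.in" "$tmp/big.out"

sparse_report "$tmp/big.in"
sparse_report "$tmp/big.out"
```

The apparent sizes should match exactly; any difference in the allocated
figures reflects how much of the hole the destination filesystem kept.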

We could, I suppose, use FALLOC_FL_PUNCH_HOLE where available
to cater for this case. I'll see whether this is worth adding.
Hole punching can be applied after the fact anyway:

$ fallocate --dig-holes bigfile.out
$ xfs_bmap -v bigfile.out
bigfile.out:
 EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET        TOTAL
   0: [0..2047]:       260408..262455    0 (260408..262455)  2048
   1: [2048..22527]:   hole                                 20480
   2: [22528..24575]:  264376..266423    0 (264376..266423)  2048
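
Since digging holes only deallocates ranges that already read as zeros, the
file contents should be unchanged afterwards. A small sketch of that sanity
check, assuming GNU coreutils and a util-linux fallocate that has
--dig-holes; the file name is arbitrary, and the step is allowed to fail on
filesystems without hole punching:

```shell
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
f="$tmp/dense.out"

# 1MiB of data, 10MiB of explicitly written zeros, 1MiB of data.
{
    head -c 1M /dev/urandom
    head -c 10M /dev/zero
    head -c 1M /dev/urandom
} > "$f"

before=$(sha256sum < "$f")

# Hole punching is filesystem dependent, so tolerate failure here.
fallocate --dig-holes "$f" || echo "dig-holes not supported here" >&2

after=$(sha256sum < "$f")
[ "$before" = "$after" ] && echo "contents unchanged"

du -k --apparent-size "$f"   # logical size, unaffected by hole punching
du -k "$f"                   # allocated size, smaller if holes were dug
```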

thanks,
Pádraig.

Attachment: dd-sparse-xfs.patch
Description: Text Data
