xfs

Re: Request for information on bloated writes using Swift

To: Dave Chinner <david@xxxxxxxxxxxxx>, Dilip Simha <nmdilipsimha@xxxxxxxxx>
Subject: Re: Request for information on bloated writes using Swift
From: Eric Sandeen <sandeen@xxxxxxxxxxx>
Date: Wed, 3 Feb 2016 09:02:40 -0600
Cc: xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20160203083016.GD459@dastard>
References: <CAFHL4X1JU02LFYntkqhYg1N++ZU46ML3v5higo1nRsPyoZxL5A@xxxxxxxxxxxxxx> <56B16A3C.1030207@xxxxxxxxxxx> <CAFHL4X0QBtFpz3=HMVMrp6NoaW5BRkDSoTE1yJQvQ=0JrW5+YQ@xxxxxxxxxxxxxx> <20160203063705.GB459@dastard> <CAFHL4X0m8Ov+zJxteUJJxzEHVXpJsfe=9mtapRmWkhT6VRkDxg@xxxxxxxxxxxxxx> <20160203083016.GD459@dastard>
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:38.0) Gecko/20100101 Thunderbird/38.5.1

On 2/3/16 2:30 AM, Dave Chinner wrote:
> On Tue, Feb 02, 2016 at 11:09:15PM -0800, Dilip Simha wrote:
>> Hi Dave,
>>
>> On Tue, Feb 2, 2016 at 10:37 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
>>
>>> On Tue, Feb 02, 2016 at 07:40:34PM -0800, Dilip Simha wrote:
>>>> Hi Eric,
>>>>
>>>> Thank you for your quick reply.
>>>>
>>>> Using xfs_io as per your suggestion, I am able to reproduce the issue.
>>>> However, I need to falloc for 256K and write for 257K to see this issue.
>>>>
>>>> # xfs_io -f -c "falloc 0 256k" -c "pwrite 0 257k" /srv/node/r1/t1.txt
>>>> # stat /srv/node/r1/t4.txt | grep Blocks
>>>>   Size: 263168     Blocks: 1536       IO Block: 4096   regular file
>>>
>>> Fallocate sets the XFS_DIFLAG_PREALLOC on the inode.
>>>
>>> When you write *past the preallocated area* and do delayed
>>> allocation, the speculative preallocation beyond EOF is double the
>>> size of the extent at EOF, i.e. 512k, leading to 768k being
>>> allocated to the file (1536 blocks, exactly).
>>>
>>
>> Thank you for the details.
>> This is exactly where I am a bit perplexed. Since the reclamation logic
>> skips inodes that have the XFS_DIFLAG_PREALLOC flag set, why did the
>> allocation logic allot more blocks on such an inode?
> 
> To store the data you wrote outside the preallocated region, of
> course.

I think what Dilip meant was, why does it do preallocation, not
why does it allocate blocks for the data.  That part is obvious
of course.  ;)

IOW, if XFS_DIFLAG_PREALLOC prevents speculative preallocation
from being reclaimed, why is speculative preallocation added to files
with that flag set?

Seems like a fair question, even if Swift's use of preallocation is
ill-advised.

I don't have all the speculative preallocation heuristics in my
head like you do, Dave, but if I have it right, the sequence is:

1) preallocate 256k
2) inode gets XFS_DIFLAG_PREALLOC
3) write 257k
4) inode gets speculative preallocation added due to write past EOF
5) inode never gets preallocation trimmed due to XFS_DIFLAG_PREALLOC

that seems suboptimal.
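
To make the arithmetic in that sequence concrete, here is a rough Python
sketch. It is a toy model, not the XFS allocator: the helper names
allocated_kib and stat_blocks are made up for illustration, it takes
Dave's "double the size of the extent at EOF" at face value, and it
assumes the data written past the preallocated range fits inside the
speculative region. stat reports Blocks in 512-byte units.

```python
SECTOR = 512  # stat(1) reports "Blocks" in 512-byte units


def allocated_kib(prealloc_kib, write_end_kib):
    """Toy model (hypothetical helper, not kernel code) of steps 1-5."""
    if write_end_kib <= prealloc_kib:
        # The write stays inside the fallocate'd range: nothing extra.
        return prealloc_kib
    # The write crosses EOF: speculative preallocation of twice the
    # extent at EOF (per Dave's description) is added and, because of
    # XFS_DIFLAG_PREALLOC, never trimmed afterwards.
    speculative_kib = 2 * prealloc_kib
    return prealloc_kib + speculative_kib


def stat_blocks(kib):
    """Convert KiB of allocated space to 512-byte stat blocks."""
    return kib * 1024 // SECTOR


print(stat_blocks(allocated_kib(256, 256)))  # 512
print(stat_blocks(allocated_kib(256, 257)))  # 1536
```

Under this model, the one extra kilobyte written past the preallocated
range leaves 512k of speculative space attached to the file.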

Never doing speculative preallocation on files with XFS_DIFLAG_PREALLOC
set, regardless of file offset, would seem sane to me.  App asked
to take control via prealloc; let it have it, and leave it at that.
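
As a sketch of what that rule could look like, in toy-model form: the
function and parameter names below are invented for illustration and do
not correspond to anything in the XFS source, and the doubling is just
Dave's description from earlier in the thread.

```python
def speculative_prealloc_kib(extent_at_eof_kib, has_prealloc_flag,
                             skip_when_flagged):
    """Hypothetical policy sketch, not the actual XFS heuristic.

    skip_when_flagged=False models the behaviour discussed in the thread;
    skip_when_flagged=True models the rule proposed here.
    """
    if has_prealloc_flag and skip_when_flagged:
        # The application took control via fallocate; add nothing.
        return 0
    # Otherwise, double the extent at EOF (assumed from the thread).
    return 2 * extent_at_eof_kib


# Behaviour described in the thread: 512 KiB added to a flagged inode.
print(speculative_prealloc_kib(256, True, skip_when_flagged=False))  # 512
# Proposed rule: nothing speculative on a flagged inode.
print(speculative_prealloc_kib(256, True, skip_when_flagged=True))   # 0
```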

(Of course now I'll go read the code to see if I understand it
properly...)

-Eric
