Re: spurious -ENOSPC on XFS

To: Lachlan McIlroy <lachlan@xxxxxxx>
Subject: Re: spurious -ENOSPC on XFS
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Thu, 15 Jan 2009 09:16:55 +1100
Cc: Christoph Hellwig <hch@xxxxxxxxxxxxx>, Mikulas Patocka <mpatocka@xxxxxxxxxx>, linux-kernel@xxxxxxxxxxxxxxx, xfs@xxxxxxxxxxx
In-reply-to: <496C2D69.2010301@xxxxxxx>
Mail-followup-to: Lachlan McIlroy <lachlan@xxxxxxx>, Christoph Hellwig <hch@xxxxxxxxxxxxx>, Mikulas Patocka <mpatocka@xxxxxxxxxx>, linux-kernel@xxxxxxxxxxxxxxx, xfs@xxxxxxxxxxx
References: <Pine.LNX.4.64.0901120509550.11089@xxxxxxxxxxxxxxxxxxxxxxxxxxx> <20090112151133.GA24852@xxxxxxxxxxxxx> <496C2D69.2010301@xxxxxxx>
User-agent: Mutt/1.5.18 (2008-05-17)
On Tue, Jan 13, 2009 at 04:58:01PM +1100, Lachlan McIlroy wrote:
> Christoph Hellwig wrote:
>> On Mon, Jan 12, 2009 at 06:14:36AM -0500, Mikulas Patocka wrote:
>>> Hi
>>>
>>> I discovered a bug in XFS in delayed allocation.
>>>
>>> When you take a small partition (52MB in my case) and copy many small
>>> files onto it (source code) that barely fit there, you get -ENOSPC.
>>> Then sync the partition, some free space pops up, click "retry" in MC
>>> and the copy continues. Then you get -ENOSPC again, you must sync,
>>> click "retry" and go on. And so on a few times until the source code
>>> finally fits on the XFS partition.
>>>
>>> This misbehavior is apparently caused by delayed allocation. Delayed
>>> allocation does not know exactly how much space will be occupied by
>>> data, so it makes an upper-bound guess. Because the free space count is
>>> only a guess, not the actual data being consumed, XFS should not
>>> return -ENOSPC based on it. When the free space runs out, XFS
>>> should sync itself, retry the allocation and only return -ENOSPC if it
>>> fails the second time, after the sync.
> This sounds like a problem with speculative allocation - delayed allocations
> beyond EOF.  Even if we write a small file, say 4k, a 64k chunk of delayed
> allocation will be credited to the file.

The second retry occurs without speculative EOF allocation. That's
what the BMAPI_SYNC flag does....

That being said, it can't truncate away pre-existing speculative
allocations on other files, which is why there is a global flush
and wait before the third retry.....
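
As a rough illustration of the three attempts described above, here is a
toy user-space model in C. It is not the XFS code: every name in it
(toy_fs, toy_reserve, toy_flush_all, toy_write, SPEC_BLOCKS) is
hypothetical, the 16-block figure just mirrors the "64k for a 4k file"
example quoted earlier, and it simply assumes that the global flush hands
other files' speculative reservations back to the free pool.

```c
#include <assert.h>
#include <errno.h>

#define NFILES      16
#define SPEC_BLOCKS 16  /* assumed: 64k of speculative space in 4k blocks */

/* Toy model: one free-space counter plus per-file speculative reservations. */
struct toy_fs {
    long free_blocks;
    long spec[NFILES];  /* blocks reserved beyond EOF for each file */
};

/* Reserve 'data' blocks for 'file', optionally with a speculative chunk. */
static int toy_reserve(struct toy_fs *fs, int file, long data, int speculative)
{
    long want = data + (speculative ? SPEC_BLOCKS : 0);

    assert(file >= 0 && file < NFILES);
    if (want > fs->free_blocks)
        return -ENOSPC;
    fs->free_blocks -= want;
    if (speculative)
        fs->spec[file] += SPEC_BLOCKS;
    return 0;
}

/* Stand-in for the global flush and wait: in this model it returns every
 * file's speculative reservation to the free pool (an assumption). */
static void toy_flush_all(struct toy_fs *fs)
{
    for (int i = 0; i < NFILES; i++) {
        fs->free_blocks += fs->spec[i];
        fs->spec[i] = 0;
    }
}

/* Three attempts: with speculation, without it, then after a global flush.
 * '*attempts' reports how many attempts were made. */
static int toy_write(struct toy_fs *fs, int file, long data, int *attempts)
{
    int err;

    *attempts = 1;
    err = toy_reserve(fs, file, data, 1);   /* normal path */
    if (err != -ENOSPC)
        return err;

    *attempts = 2;
    err = toy_reserve(fs, file, data, 0);   /* no speculative EOF allocation */
    if (err != -ENOSPC)
        return err;

    toy_flush_all(fs);                      /* reclaim other files' reservations */
    *attempts = 3;
    return toy_reserve(fs, file, data, 0);  /* last try; -ENOSPC is now final */
}
```

Starting from 20 free blocks in this model, a 1-block write succeeds on the
first attempt and leaves 3 blocks free, a following 2-block write only
succeeds on the second attempt, and a 4-block write after that needs the
flush before it fits. A 100-block write fails all three attempts and
returns -ENOSPC.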

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx