fallocate mode flag for "unshare blocks"?
Austin S. Hemmelgarn
ahferroin7 at gmail.com
Thu Mar 31 10:43:09 CDT 2016
On 2016-03-31 11:31, Andreas Dilger wrote:
> On Mar 31, 2016, at 1:55 AM, Christoph Hellwig <hch at infradead.org> wrote:
>>
>> On Wed, Mar 30, 2016 at 05:32:42PM -0700, Liu Bo wrote:
>>> Well, btrfs fallocate doesn't allocate space if it's a shared one
>>> because it thinks the space is already allocated. So a later overwrite
>>> over this shared extent may hit ENOSPC errors.
>>
>> And this makes it an incorrect implementation of posix_fallocate,
>> which glibc implements using fallocate if available.
>
> It isn't really useful for a COW filesystem to implement fallocate()
> to reserve blocks. Even if it did allocate all of the blocks on the
> initial fallocate() call, when it comes time to overwrite these blocks
> new blocks need to be allocated as the old ones will not be overwritten.
>
> Because of snapshots that could hold references to the old blocks,
> there isn't even the guarantee that the previous fallocated blocks will
> be released in a reasonable time to free up an equal amount of space.
That really depends on how it's done. AFAIK, unwritten extents on BTRFS
are block reservations which make sure that you can write there (IOW,
the unwritten extent gets converted to a regular extent in-place, not
via COW). This means that it is possible to guarantee that the first
write to that area will work, which is technically all that POSIX
requires. This in turn means that stuff like systemd and RDBMS software
don't exactly see things working as they expect them to, but that's
because they make assumptions based on existing technology.
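
To make the "first write" case concrete, here is a minimal sketch of the
pattern such software relies on: reserve a range with posix_fallocate(),
then write into it once. The helper name reserve_then_write and the
512-byte buffer are made up for illustration, and the sketch only shows
the calling sequence. It does not test what happens on a second overwrite
or when a snapshot or reflink shares the extent, which is the case being
argued about in this thread.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from any existing project): reserve `len`
 * bytes at the start of `path`, then do the first write into that range.
 * Returns 0 on success or an errno-style error number on failure.
 */
int reserve_then_write(const char *path, off_t len)
{
    assert(len > 0);

    int fd = open(path, O_CREAT | O_RDWR | O_TRUNC, 0600);
    if (fd < 0)
        return errno;

    /* posix_fallocate() returns the error number directly and does
     * not set errno. */
    int err = posix_fallocate(fd, 0, len);
    if (err != 0) {
        close(fd);
        return err;
    }

    /* First write into the reserved range. This is the write that the
     * reservation is supposed to protect from ENOSPC. */
    char buf[512];
    memset(buf, 0xAB, sizeof buf);
    size_t n = len < (off_t)sizeof buf ? (size_t)len : sizeof buf;

    ssize_t w = pwrite(fd, buf, n, 0);
    if (w < 0)
        err = errno;
    else if ((size_t)w != n)
        err = EIO;   /* short write, reported as a generic I/O error */

    if (close(fd) != 0 && err == 0)
        err = errno;
    return err;
}
```

Calling reserve_then_write("/some/file", 4096) and getting 0 back only
tells you the reservation and the first write both succeeded. Whether a
later overwrite of the same range can still hit ENOSPC depends on the
filesystem.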