Re: avoid mbox file fragmentation

To: Linux XFS <xfs@xxxxxxxxxxx>
Subject: Re: avoid mbox file fragmentation
From: pg@xxxxxxxxxxxxxxxxxx (Peter Grandi)
Date: Wed, 20 Oct 2010 12:50:45 +0100
In-reply-to: <20101019234217.GD12506@dastard>
References: <4CBE2403.8070108@xxxxxxxxxxxxxxxxx> <20101019234217.GD12506@dastard>
[ ...  multiple slowly growing streams case ... ]

> What you want is _physical_ preallocation, not speculative
> preallocation.

Not even that. What he really wants is applications built on a
mail database layer that handles these issues, as real DBMSes do
(or *should* do) with tablespaces and similar mechanisms, including
reservations where necessary, as you say:

> i.e. look up XFS_IOC_RESVSP or FIEMAP so your application does
> _permanent_ preallocate past EOF. Alternatively, the filesystem
> will avoid the truncation on close() if the file has the APPEND
> attribute set and the application is writing via O_APPEND...

Because these issues are common to essentially all cases of multiple
slowly growing files. I have seen slowly downloading ISO images with
hundreds of thousands of extents, never mind log files.
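
For reference, a minimal sketch of what permanent preallocation past
EOF can look like from an application. Note the assumptions: it uses
the generic Linux fallocate(2) call with FALLOC_FL_KEEP_SIZE instead of
the XFS-specific XFS_IOC_RESVSP ioctl named in the quoted text, it
reaches libc through ctypes on a 64-bit Linux system, and the helper
name 'reserve_past_eof' is made up for illustration.

```python
import ctypes
import errno
import os

FALLOC_FL_KEEP_SIZE = 0x01  # value from <linux/falloc.h>


def reserve_past_eof(fd, extra_bytes):
    """Ask the filesystem to reserve extra_bytes beyond the current EOF.

    The reported file size is left unchanged. Returns True if the space
    was reserved, False if fallocate is unavailable or unsupported here.
    (Hypothetical helper, not part of any library.)
    """
    try:
        libc = ctypes.CDLL(None, use_errno=True)
    except OSError:
        return False

    # Prefer the explicit 64-bit entry point where libc exports it.
    fallocate = getattr(libc, "fallocate64", None)
    if fallocate is None:
        fallocate = getattr(libc, "fallocate", None)
    if fallocate is None:
        return False

    # int fallocate(int fd, int mode, off_t offset, off_t len);
    # off_t is assumed to be 64 bits wide.
    fallocate.argtypes = [ctypes.c_int, ctypes.c_int,
                          ctypes.c_int64, ctypes.c_int64]
    fallocate.restype = ctypes.c_int

    eof = os.fstat(fd).st_size
    if fallocate(fd, FALLOC_FL_KEEP_SIZE, eof, extra_bytes) != 0:
        err = ctypes.get_errno()
        if err in (errno.EOPNOTSUPP, errno.ENOSYS):
            return False  # filesystem or kernel cannot do this
        raise OSError(err, os.strerror(err))
    return True
```

A mail delivery agent could call something like this once after
opening an mbox, asking for, say, a few megabytes of tail space, and
then append as usual. Where the call is supported, the allocated block
count should exceed the apparent file size until the tail is filled.
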

> The filesystem cannot do everything for you. Sometimes the
> application has to help....

That's only because file system authors are lazy and haven't
implemented 'O_PONIES' yet. :-)

However, I am a fan of having *default* physical preallocation,
because as a rule one can trade off space for speed nowadays, and
padding files with "future growth" tails is fairly cheap, and one
could modify the filesystem code or 'fsck' to reclaim unused space
in "future growth" tails.

Ideally applications would advise the filesystem of their expected
allocation patterns (e.g. "immutable", or "loglike", or "rewritable")
but the 'O_PONIES' discussions show that this is unrealistic.
Userspace sucks.

Even so, I think that a lot of good could be done by autoadvising in
the 'stdio' library, as 'stdio' flags often give a pretty huge hint.
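
As a rough illustration of that kind of autoadvising, here is a sketch
that guesses an allocation hint from an fopen-style mode string. The
mapping is purely an assumption made for the example, the function name
'advise_from_mode' is hypothetical, and no existing 'stdio' or kernel
interface accepts these hint names.

```python
def advise_from_mode(mode):
    """Guess a (hypothetical) allocation hint from an fopen-style mode.

    Returns "loglike", "rewritable", "immutable", or None when the
    mode is read-only and no allocation hint applies.
    """
    # The binary/text markers carry no information about growth.
    flags = set(mode.replace("b", "").replace("t", ""))

    if "a" in flags:
        # Append mode: the file only ever grows at the end.
        return "loglike"
    if "+" in flags:
        # Update mode: existing contents may be rewritten in place.
        return "rewritable"
    if "w" in flags or "x" in flags:
        # Plain write: assumed to be written once, front to back.
        return "immutable"
    return None
```

With this guess, "a" and "ab" map to "loglike", "r+" and "w+b" map to
"rewritable", "w" maps to "immutable", and "r" yields no hint at all.
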
Uhm, I just remembered that I wrote something fairly related in my
blog, with a nice if sadly still true quote from a crucial paper
on all this userspace (and sometimes kernel) suckage:

