
Re: fallocate everywhere?

To: karn@xxxxxxxx
Subject: Re: fallocate everywhere?
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Sat, 15 Jan 2011 15:38:56 +1100
Cc: xfs@xxxxxxxxxxx
In-reply-to: <4D309390.4060009@xxxxxxxxxxxx>
References: <4D309390.4060009@xxxxxxxxxxxx>
User-agent: Mutt/1.5.20 (2009-06-14)
On Fri, Jan 14, 2011 at 10:18:56AM -0800, Phil Karn wrote:
> Can anyone think of a good reason *not* to sprinkle fallocate() calls
> through as many Linux utilities as possible? E.g., programs like rsync,
> tar, cpio, pax, ftp, mv, cp -- anything and everything that creates a
> file with a size known in advance.

Yes. It defeats one of the principal optimisations delayed
allocation provides: allocation of contiguous ranges of disk across
multiple files at writeout time.

If you fallocate, you might allocate like this:

          file A     file B     file C     file D    file E

But if writeback order is different and the block device queues are
congested, you might end up with IO patterns like:

    |   +----------+
    |                        +---------+
    |                                             +----------+
    |              +---------+
    V                                  +----------+
          file A     file B     file C     file D    file E

That is, there's the potential for 5 separate IOs to write back the
data because they may not have an adjacent IO to merge with. If the
queue is congested, this will simply make things worse.

If the same situation occurs with delayed allocation, you end up with

          file A     file C     file E     file B    file D

as the allocation pattern, and the block layer would merge the
IOs into one large IO....
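
To make the merge argument concrete, here is a toy model in Python. It is
only a sketch under a loud assumption: an IO merges with the previous one
only when it is submitted immediately after it and sits directly after it
on disk. The function name count_ios is made up for illustration, and real
block layer merging is more involved than this.

```python
def count_ios(disk_layout, writeback_order):
    """Count IOs issued when only back-to-back, disk-adjacent writes merge.

    Toy model: each file occupies one slot on disk, in disk_layout order.
    """
    position = {name: slot for slot, name in enumerate(disk_layout)}
    ios = 0
    prev = None
    for name in writeback_order:
        # A new IO starts unless this file directly follows the previous one on disk.
        if prev is None or position[name] != position[prev] + 1:
            ios += 1
        prev = name
    return ios


writeback = ["A", "C", "E", "B", "D"]

# Preallocated layout: fixed before writeback order is known.
print(count_ios(["A", "B", "C", "D", "E"], writeback))  # 5

# Delayed allocation: layout is chosen in writeback order.
print(count_ios(writeback, writeback))  # 1
```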

> As far as I can tell, calling fallocate() when it's not supported
> quickly returns an error and does no harm. So I can't even think of a
> reason to only make it optional.

I can. It fails the "optimise allocation for optimal writeback
patterns" test. In most cases preallocation is not necessary because
filesystems do a good job of this. The filesystems that don't do a
good job of this don't implement fallocate() anyway, so adding
fallocate to the userspace tools doesn't really help the filesystems
that need help to begin with...
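
For reference, the "try it and carry on if it fails" pattern under
discussion looks roughly like the sketch below. try_preallocate is a
hypothetical helper name. Note one assumption that matters: Python's
os.posix_fallocate() wraps posix_fallocate(3), not the raw fallocate(2)
syscall, and glibc may emulate it by writing to the file when the
filesystem lacks native support, so it does not always fail quickly.

```python
import errno
import os
import tempfile


def try_preallocate(fd, size):
    """Try to reserve size bytes for fd. Return True if space was reserved,
    False if preallocation is unavailable and the caller should just write."""
    if size <= 0 or not hasattr(os, "posix_fallocate"):
        return False
    try:
        os.posix_fallocate(fd, 0, size)
        return True
    except OSError as e:
        # Treat "not supported here" as harmless; re-raise real failures
        # such as ENOSPC so the caller sees them.
        if e.errno in (errno.EOPNOTSUPP, errno.ENOSYS, errno.EINVAL):
            return False
        raise


with tempfile.TemporaryFile() as f:
    reserved = try_preallocate(f.fileno(), 1 << 20)
    print("preallocated" if reserved else "left to normal allocation")
```

Whether a tool should make this call at all is the question above; the
sketch only shows that the fallback itself is cheap to write.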

> If it's implemented as an
> off-by-default option, most people would probably not know about it so
> it would rarely get used. Those who do know about it would frequently
> forget to use it, and choosing and learning a separate option for every
> command would be painful.
> Besides xfs, ext4 supports fallocate so I expect that most Linux systems
> will be able to benefit from it fairly soon.

Proper fallocate support requires a method for recording on disk
whether the block/extent is initialised or not. Most filesystems
don't have this, nor will they implement it, so they won't grow
fallocate support. And like I said, filesystems like ext4, xfs and
btrfs don't need fallocate help for stuff like tar/rsync etc.

If you really want to make stuff like cp/tar/ftp/find better, maybe
address the bigger problems that limit their throughput - they
are single threaded and aren't very smart.  Updating them to be
multithreaded, use async IO, large buffers, use sparse file/hole
detection by default, use sendfile/splice tricks for zero-copy IO,
etc would be, IMO, more useful for the long term.....
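
As one illustration of the sparse file/hole detection idea, the sketch
below copies only the data regions of a file and leaves the holes
unwritten. It assumes a kernel and filesystem that support the lseek()
SEEK_DATA/SEEK_HOLE interface, which is my choice for the example and is
not named above; copy_sparse is a hypothetical helper. It is single
threaded and synchronous, so it addresses only one item on the list.

```python
import errno
import os
import tempfile


def copy_sparse(src_path, dst_path, bufsize=1 << 20):
    """Copy src to dst, writing only the data extents and skipping holes."""
    src = os.open(src_path, os.O_RDONLY)
    dst = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        size = os.fstat(src).st_size
        offset = 0
        while offset < size:
            try:
                start = os.lseek(src, offset, os.SEEK_DATA)
            except OSError as e:
                if e.errno == errno.ENXIO:  # nothing but a hole up to EOF
                    break
                raise
            end = os.lseek(src, start, os.SEEK_HOLE)
            if end <= start:  # defensive: file changed underneath us
                break
            pos = start
            while pos < end:
                chunk = os.pread(src, min(bufsize, end - pos), pos)
                if not chunk:
                    break
                written = 0
                while written < len(chunk):  # handle short writes
                    written += os.pwrite(dst, chunk[written:], pos + written)
                pos += len(chunk)
            offset = end
        # Set the final length so a trailing hole is preserved.
        os.ftruncate(dst, size)
    finally:
        os.close(src)
        os.close(dst)


with tempfile.TemporaryDirectory() as d:
    src_file = os.path.join(d, "src")
    dst_file = os.path.join(d, "dst")
    with open(src_file, "wb") as f:
        f.write(b"head")
        f.seek(1 << 20)  # leave a gap that the filesystem may store as a hole
        f.write(b"tail")
    copy_sparse(src_file, dst_file)
    with open(src_file, "rb") as a, open(dst_file, "rb") as b:
        print(a.read() == b.read())
```

On filesystems without hole tracking the kernel typically reports the
whole file as one data extent, in which case this degrades to a plain copy.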


Dave Chinner
