Re: [PATCH 1/6] fs: add hole punching to fallocate

To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: [PATCH 1/6] fs: add hole punching to fallocate
From: "Ted Ts'o" <tytso@xxxxxxx>
Date: Tue, 9 Nov 2010 16:41:47 -0500
Cc: Josef Bacik <josef@xxxxxxxxxx>, linux-kernel@xxxxxxxxxxxxxxx, linux-btrfs@xxxxxxxxxxxxxxx, linux-ext4@xxxxxxxxxxxxxxx, linux-fsdevel@xxxxxxxxxxxxxxx, xfs@xxxxxxxxxxx, joel.becker@xxxxxxxxxx, cmm@xxxxxxxxxx, cluster-devel@xxxxxxxxxx
In-reply-to: <20101109044242.GH2715@dastard>
Mail-followup-to: Ted Ts'o <tytso@xxxxxxx>, Dave Chinner <david@xxxxxxxxxxxxx>, Josef Bacik <josef@xxxxxxxxxx>, linux-kernel@xxxxxxxxxxxxxxx, linux-btrfs@xxxxxxxxxxxxxxx, linux-ext4@xxxxxxxxxxxxxxx, linux-fsdevel@xxxxxxxxxxxxxxx, xfs@xxxxxxxxxxx, joel.becker@xxxxxxxxxx, cmm@xxxxxxxxxx, cluster-devel@xxxxxxxxxx
References: <1289248327-16308-1-git-send-email-josef@xxxxxxxxxx> <20101109011222.GD2715@dastard> <20101109033038.GF3099@xxxxxxxxx> <20101109044242.GH2715@dastard>
User-agent: Mutt/1.5.20 (2009-06-14)

On Tue, Nov 09, 2010 at 03:42:42PM +1100, Dave Chinner wrote:
> Implementation is up to the filesystem. However, XFS does (b)
> because:
> 
>       1) it was extremely simple to implement (one of the
>          advantages of having an exceedingly complex allocation
>          interface to begin with :P)
>       2) conversion is atomic, fast and reliable
>       3) it is independent of the underlying storage; and
>       4) reads of unwritten extents operate at memory speed,
>          not disk speed.

Yeah, I was thinking that using a device-style TRIM might be better,
since future attempts to write to the range won't require a separate
seek to modify the extent tree.  But yeah, there are a bunch of
advantages to simply mutating the extent tree.

While we're on the subject of changes to fallocate, what do people
think of a FALLOC_FL_EXPOSE_OLD_DATA flag, which would require either
root privileges or (if capabilities are in use) CAP_DAC_OVERRIDE &&
CAP_MAC_OVERRIDE && CAP_SYS_ADMIN?  This would allow a trusted process
to fallocate blocks with the extent already marked initialized, so
reads would return whatever was previously on disk.  I've had two
requests for such functionality for ext4 already.
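
For contrast, here is a minimal sketch of the behaviour such a flag
would relax: preallocated space normally reads back as zeros, never as
stale disk contents.  The helper names (preallocate, range_is_zero) are
made up for illustration, and posix_fallocate() is used as a portable
stand-in for fallocate() with mode 0; on a filesystem without
preallocation support glibc may emulate it by writing zeros, so this
shows the visible semantics only, not how the extents are tracked.  The
proposed flag itself is not used, since it is only a suggestion here.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: reserve the first `len` bytes of `fd`.
 * Returns 0 on success or an errno value (posix_fallocate does
 * not set errno, it returns the error directly). */
int preallocate(int fd, off_t len)
{
    return posix_fallocate(fd, 0, len);
}

/* Hypothetical helper: returns 1 if the first `len` bytes of `fd`
 * all read back as zero, 0 if any byte is non-zero, and -1 on a
 * read error or unexpected end of file. */
int range_is_zero(int fd, off_t len)
{
    unsigned char buf[4096];
    off_t off = 0;

    while (off < len) {
        off_t left = len - off;
        size_t want = left < (off_t)sizeof(buf) ? (size_t)left : sizeof(buf);
        ssize_t n = pread(fd, buf, want, off);

        if (n <= 0)
            return -1;
        for (ssize_t i = 0; i < n; i++)
            if (buf[i] != 0)
                return 0;
        off += n;
    }
    return 1;
}
```

With something like FALLOC_FL_EXPOSE_OLD_DATA the second helper could
legitimately return 0 on a freshly allocated range, which is why the
proposal restricts the flag to trusted callers.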

(Take for example a trusted cluster filesystem backend that checks the
object checksum before returning any data to the user; if the check
fails, the cluster filesystem will try to use some other replica
stored on some other server.)
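
A rough sketch of that verify-then-fall-back pattern, to show why stale
block contents would never reach the user of such a backend.
Everything here is an assumption made for illustration: the struct
replica type, the pick_verified_replica() helper, and the use of
32-bit FNV-1a as the checksum (a real backend would presumably use
something stronger and would fetch replicas over the network rather
than from memory).

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in checksum: 32-bit FNV-1a.  Chosen only because it is
 * short; it is not what any particular cluster filesystem uses. */
uint32_t fnv1a(const unsigned char *data, size_t len)
{
    uint32_t h = 2166136261u;          /* FNV offset basis */

    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 16777619u;                /* FNV prime */
    }
    return h;
}

/* Hypothetical in-memory view of one copy of an object. */
struct replica {
    const unsigned char *data;
    size_t len;
};

/* Returns the index of the first replica whose checksum matches
 * `expected`, or -1 if every copy fails verification.  A copy
 * that contains exposed old data simply fails the check and is
 * skipped. */
int pick_verified_replica(const struct replica *r, size_t n,
                          uint32_t expected)
{
    for (size_t i = 0; i < n; i++)
        if (fnv1a(r[i].data, r[i].len) == expected)
            return (int)i;
    return -1;
}
```

The caller returns data only from the replica index it gets back, so a
local copy backed by never-written blocks is treated like any other
corrupt replica.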

                                                 - Ted