Re: seeking advice on sparse files on xfs

To: Joe Landman <joe.landman@xxxxxxxxx>
Subject: Re: seeking advice on sparse files on xfs
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Thu, 20 Dec 2012 09:54:06 +1100
Cc: "xfs@xxxxxxxxxxx" <xfs@xxxxxxxxxxx>
In-reply-to: <50D23D09.3080708@xxxxxxxxx>
References: <50D23D09.3080708@xxxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)

On Wed, Dec 19, 2012 at 05:17:45PM -0500, Joe Landman wrote:
> Hi folks:
> 
>   I've been using sparse files on xfs for a while, and every now and
> then have the pleasure of copying them.  Using coreutils cp
> (standard linux cp) usually winds up with the utility seeking
> through the entire file (yes, even with the --sparse=* options
> set).
> 
>   It seems to me that the code is blissfully unaware of the file
> extents, and its sparse implementation amounts to a read, a check to
> see if it needs to write it as a hole, write it, and the next seek.
> Iterate until done.
> 
>   Here's my question.  Is there a way to (easily) programmatically
> hand cp (or any other utility) something akin to the output of
> xfs_bmap, and thus save it potentially *gargantuan* amounts of
> seeking over known zero regions?  File sparsity in these cases ranges
> from 80-99% (fills of 1-20%) for multi-GB/TB sized
> files.
> 
>   Pointers appreciated.  I am looking at the copy routines in
> coreutils now, looking to see if we can increase its intelligence
> somewhat w.r.t. sparse files.

Here's a good overview of the state of play:

http://www.linuxplumbersconf.org/2012/wp-content/uploads/2012/08/sparse-improvements-LPC-2012.pdf

And what you really want is a version of cp that supports these:

$ man lseek
....
   Seeking file data and holes
       Since version 3.1, Linux supports the following additional
       values for whence:

       SEEK_DATA
              Adjust  the  file  offset to the next location in the
              file greater than or equal to offset containing data.
              If offset points to data, then the file offset is set
              to offset.

       SEEK_HOLE
              Adjust the file offset to the next hole in the file
              greater than or equal to offset.  If offset points
              into the middle of a hole, then the file offset  is
              set to offset.  If there is no hole past offset, then
              the file offset is adjusted to the end of the file
              (i.e., there is an implicit hole at the end of any
              file).
.....
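
As a rough illustration of how a copy loop could use these two whence
values, here is a minimal sketch in Python (os.SEEK_DATA and os.SEEK_HOLE
are available since Python 3.3). The function name copy_sparse and the
chunk size are made up for this example, and this is not how coreutils
does it; it only shows the idea of reading the data regions and skipping
the holes.

```python
import errno
import os


def copy_sparse(src_path, dst_path, chunk=1 << 20):
    """Copy src_path to dst_path, reading only the regions that hold data.

    Hypothetical helper for illustration. Holes in the source are never
    read; the destination is extended over them with ftruncate().
    """
    sfd = os.open(src_path, os.O_RDONLY)
    try:
        dfd = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            size = os.fstat(sfd).st_size
            offset = 0
            while offset < size:
                try:
                    # Next offset >= `offset` that contains data.
                    data = os.lseek(sfd, offset, os.SEEK_DATA)
                except OSError as e:
                    if e.errno == errno.ENXIO:
                        break  # nothing but a hole up to end of file
                    raise
                # Next hole after that data (end of file counts as a hole).
                hole = os.lseek(sfd, data, os.SEEK_HOLE)
                if hole <= data:
                    break  # defensive: avoid looping without progress

                # Copy the data region [data, hole) at the same offsets.
                pos = data
                while pos < hole:
                    buf = os.pread(sfd, min(chunk, hole - pos), pos)
                    if not buf:
                        break  # source shrank underneath us
                    done = 0
                    while done < len(buf):
                        done += os.pwrite(dfd, buf[done:], pos + done)
                    pos += len(buf)
                offset = hole

            # Give the destination the same length, so a trailing hole
            # (or an all-hole file) is preserved without writing zeros.
            os.ftruncate(dfd, size)
        finally:
            os.close(dfd)
    finally:
        os.close(sfd)
```

Calling copy_sparse("src.img", "dst.img") (both paths are placeholders)
never reads the holes in src.img. On a filesystem that does not track
holes, the kernel reports the whole file as a single data region, so the
loop degrades to an ordinary full copy rather than failing.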

I'm pretty sure coreutils support is in the pipeline right now....

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
