
Re: vmsplice can't work well

To: David Chinner <dgc@xxxxxxx>
Subject: Re: vmsplice can't work well
From: Jens Axboe <axboe@xxxxxxxxx>
Date: Fri, 1 Sep 2006 15:45:12 +0200
Cc: "Jeffrey E. Hundstad" <jeffrey.hundstad@xxxxxxxx>, xfs@xxxxxxxxxxx, nathans@xxxxxxx
In-reply-to: <20060901131913.GG5737019@xxxxxxxxxxxxxxxxx>
References: <44F4440F.1090300@xxxxxxxxx> <20060829140542.GN12257@xxxxxxxxx> <44F5CC08.8010205@xxxxxxxx> <20060830174815.GF7331@xxxxxxxxx> <44F5D3C6.1010108@xxxxxxxx> <20060831092440.GC5528@xxxxxxxxx> <20060901131913.GG5737019@xxxxxxxxxxxxxxxxx>
Sender: xfs-bounce@xxxxxxxxxxx
On Fri, Sep 01 2006, David Chinner wrote:
> On Thu, Aug 31, 2006 at 11:24:41AM +0200, Jens Axboe wrote:
> > XFS list,
> > 
> > On Wed, Aug 30 2006, Jeffrey E. Hundstad wrote:
> > > Jens Axboe wrote:
> > > >On Wed, Aug 30 2006, Jeffrey E. Hundstad wrote:
> > > >  
> > > >>I tried your splie-git...tar.gz file and tried the splice-cp.  It 
> > > >>produced files that are the right length... but the files only contain 
> > > >>nulls.  Here's the straces:
> > > >>    
> > > >
> > > >Works for me as well. Could be an fs issue, how large was the README and
> > > >what filesystem did you use?
> > > >
> > > >  
> > > The file was 1130 bytes (it was the README in that directory.)  The 
> > > filesystem is XFS.
> > > 
> > 
> > I can reproduce this quite easily, doing:
> > 
> > nelson:~ # splice-cp sda.blktrace.0 foo
> > 
> > nelson:~ # md5sum sda.blktrace.0 foo
> > 4754070ae77091468c830ea23b125d68  sda.blktrace.0
> > efdc7b9d00692fdfe91a691277209267  foo
> 
> Busted write side - splice-in works fine, splice-out is an alias
> for /dev/zero. The reason it's full of NULLs:
> 
> death:/mnt# xfs_bmap -vv foo
> foo: no extents
> death:/mnt#
> 
> It's a hole.  Nothing has been flushed out to disk.
> 
> Interesting - the inode is leaving pipe_to_file() dirty, the page is
> dirty, the buffer head is dirty, delay, mapped and uptodate. The
> page is the only page in the radix tree and the radix tree is marked
> dirty.
> 
> But it never gets flushed out. Even when I use dd to seek past the
> first disk block and write further into the file, I still end up
> with a hole in the range where the original splice write should
> be which means it was no longer in the page cache.
> 
> Copying a large file I can see dirty memory increase to tens of
> megabytes.  Nothing is going to disk, writeback is not going above
> zero.  Interestingly, when the write completes, the size of the page
> cache drops by almost exactly the size of the file being written -
> almost like a truncate_inode_pages() is occurring on file close.
> 
> Oh, look - we _are_ tossing away all the pages on close.
> 
> xfs_splice_write() hasn't updated the xfs inode size when extending the
> file. The Linux inode has the correct value, but xfs thinks that it's
> only got a speculative allocation beyond EOF (i.e. EOF is still 0), so
> we invalidate it before it gets to disk.
> 
> The patch below just copies some code out of xfs_write() where it
> updates the xfs inode size and drops it in xfs_splice_write(). It's
> almost certainly not the right fix, but the bucket under the pipe will
> now catch most of the bits....
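
The failure mode Dave describes above can be illustrated with a toy
userspace model. This is only a sketch and not XFS code: the struct, the
function names, the page size and the update_fs_size flag are all
hypothetical stand-ins. It assumes the filesystem keeps its own idea of
the file size next to the generic inode size, and that close discards
every cached page beyond the filesystem's size.

```c
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096
#define TOY_MAX_PAGES 16

/* Toy inode with two notions of file size (all names are hypothetical). */
struct toy_inode {
    size_t vfs_size;             /* size the generic (Linux) inode reports */
    size_t fs_size;              /* size the filesystem itself believes in */
    int    dirty[TOY_MAX_PAGES]; /* 1 = page holds data not yet on disk */
};

/*
 * Buffered write: dirties the touched pages and extends vfs_size.
 * fs_size is extended only when update_fs_size is non-zero, which models
 * a write path that does (or does not) update the filesystem's own size.
 */
static void toy_write(struct toy_inode *ip, size_t off, size_t len,
                      int update_fs_size)
{
    size_t p, first, last;

    if (len == 0)
        return;

    first = off / TOY_PAGE_SIZE;
    last = (off + len - 1) / TOY_PAGE_SIZE;
    for (p = first; p <= last && p < TOY_MAX_PAGES; p++)
        ip->dirty[p] = 1;

    if (off + len > ip->vfs_size)
        ip->vfs_size = off + len;
    if (update_fs_size && off + len > ip->fs_size)
        ip->fs_size = off + len;
}

/*
 * Close: throw away every cached page that starts at or beyond fs_size,
 * as if it were speculative space past EOF. Returns how many dirty pages
 * were lost before they could reach disk.
 */
static size_t toy_close(struct toy_inode *ip)
{
    size_t p, lost = 0;
    size_t first_dropped = (ip->fs_size + TOY_PAGE_SIZE - 1) / TOY_PAGE_SIZE;

    for (p = first_dropped; p < TOY_MAX_PAGES; p++) {
        if (ip->dirty[p]) {
            ip->dirty[p] = 0;
            lost++;
        }
    }
    return lost;
}
```

In this model, a 1130-byte write with update_fs_size set to 0 leaves
vfs_size at 1130 and fs_size at 0, so toy_close() discards the only dirty
page and the file reads back as a hole. With update_fs_size set to 1 the
page survives close and stays dirty for writeback.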

Good analysis and fix, Dave! I don't have time to test it right now,
perhaps Jeffrey can give it a shot? Will you make sure this gets into
2.6.18?

-- 
Jens Axboe

