| To: | Dave Chinner <david@xxxxxxxxxxxxx> |
|---|---|
| Subject: | Re: [BUG] ext2/3/4: dio reads stale data when we do some append dio writes |
| From: | Zheng Liu <gnehzuil.liu@xxxxxxxxx> |
| Date: | Tue, 19 Nov 2013 20:20:02 +0800 |
| Cc: | Christoph Hellwig <hch@xxxxxxxxxxxxx>, linux-fsdevel@xxxxxxxxxxxxxxx, linux-ext4@xxxxxxxxxxxxxxx, xfs@xxxxxxxxxxx |
| Delivered-to: | xfs@xxxxxxxxxxx |
| In-reply-to: | <20131119120112.GN11434@dastard> |
| Mail-followup-to: | Dave Chinner <david@xxxxxxxxxxxxx>, Christoph Hellwig <hch@xxxxxxxxxxxxx>, linux-fsdevel@xxxxxxxxxxxxxxx, linux-ext4@xxxxxxxxxxxxxxx, xfs@xxxxxxxxxxx |
| References: | <20131119095302.GA4534@xxxxxxxxx> <20131119102235.GA5010@xxxxxxxxxxxxx> <20131119104508.GA4630@xxxxxxxxx> <20131119110147.GA3323@xxxxxxxxxxxxx> <20131119111947.GA4782@xxxxxxxxx> <20131119111826.GA20485@xxxxxxxxxxxxx> <20131119120112.GN11434@dastard> |
| User-agent: | Mutt/1.5.21 (2010-09-15) |
On Tue, Nov 19, 2013 at 11:01:12PM +1100, Dave Chinner wrote:
> On Tue, Nov 19, 2013 at 03:18:26AM -0800, Christoph Hellwig wrote:
> > On Tue, Nov 19, 2013 at 07:19:47PM +0800, Zheng Liu wrote:
> > > Yes, I know that XFS has a shared/exclusive lock. I guess that is why
> > > it can pass the test. But another question is why XFS fails when we do
> > > some append dio writes while doing buffered reads.
> >
> > Can you provide a test case for that issue?
>
> For XFS, appending direct IO writes only hold the IOLOCK exclusive
> for as long as it takes to guarantee that the region between the
> old EOF and the new EOF is full of zeros before it is demoted. i.e.
> once the region is guaranteed not to expose stale data, the
> exclusive IO lock is demoted to a shared lock and a buffered read
> is then allowed to proceed concurrently with the DIO write.
>
> Hence even appending writes occur concurrently with buffered reads,
> and if the read overlaps the block at the old EOF then the page
> brought into the page cache will have zeros in it.
>
> FWIW, there's a wonderful comment in generic_file_direct_write()
> that pretty much covers this case:
>
> /*
> * Finally, try again to invalidate clean pages which might have been
> * cached by non-direct readahead, or faulted in by get_user_pages()
> * if the source of the write was an mmap'ed region of the file
> * we're writing. Either one is a pretty crazy thing to do,
> * so we don't support it 100%. If this invalidation
> * fails, tough, the write still worked...
> */
>
> The kernel code simply does not have the exclusion mechanisms to
> make concurrent buffered and direct IO robust. This is one of the
> problems (amongst many) that we've been looking to solve with a VFS
> level IO range lock of some kind....
Thanks for pointing it out.
- Zheng
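
The interleaving Dave describes can be sketched with a small userspace model. This is only a toy under stated assumptions, not kernel code: ToyFile, buffered_read, dio_append and race are hypothetical names, the "page cache" is a dict of zero-padded page snapshots, and the ordering of the read and the write is fixed by hand instead of coming from real threads or real locks. It shows a buffered read caching the page at the old EOF, an appending direct write going straight to "disk", and what a later buffered read returns depending on whether the final invalidation succeeds.

```python
PAGE_SIZE = 4096


class ToyFile:
    """Toy model: 'disk' is the backing store, 'cache' stands in for the page cache."""

    def __init__(self, data):
        self.disk = bytearray(data)
        self.cache = {}  # page index -> zero-padded snapshot of that page

    def buffered_read(self, offset, length):
        """Read through the cache, filling missing pages from disk."""
        out = bytearray()
        end = min(offset + length, len(self.disk))
        pos = offset
        while pos < end:
            idx = pos // PAGE_SIZE
            if idx not in self.cache:
                start = idx * PAGE_SIZE
                page = bytes(self.disk[start:start + PAGE_SIZE])
                # The part of the page beyond EOF is filled with zeros.
                self.cache[idx] = page.ljust(PAGE_SIZE, b"\0")
            in_page = pos % PAGE_SIZE
            n = min(end - pos, PAGE_SIZE - in_page)
            out += self.cache[idx][in_page:in_page + n]
            pos += n
        return bytes(out)

    def dio_append(self, data, invalidate=True):
        """Append directly to disk, bypassing the cache."""
        old_eof = len(self.disk)
        self.disk.extend(data)
        if invalidate:
            # Models the final invalidation pass succeeding.
            first = old_eof // PAGE_SIZE
            last = (old_eof + len(data) - 1) // PAGE_SIZE
            for idx in range(first, last + 1):
                self.cache.pop(idx, None)


def race(invalidate):
    """Buffered read at the old EOF, then an appending direct write."""
    f = ToyFile(b"A" * 512)
    f.buffered_read(0, 512)          # caches page 0 with a zeroed tail
    f.dio_append(b"B" * 512, invalidate=invalidate)
    return f.buffered_read(512, 512)  # read back the appended range


if __name__ == "__main__":
    print(race(invalidate=True) == b"B" * 512)    # fresh data
    print(race(invalidate=False) == b"\0" * 512)  # stale zeros from the cache
```

In this model both comparisons print True: when the invalidation is skipped, the buffered reader keeps seeing the zeroed tail of the page it cached before the append, which is the window that a range lock covering both IO paths would be meant to close.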