
Re: [PATCH] dio: track and serialise unaligned direct IO

To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: [PATCH] dio: track and serialise unaligned direct IO
From: Matthew Wilcox <matthew@xxxxxx>
Date: Thu, 29 Jul 2010 20:53:24 -0600
Cc: linux-fsdevel@xxxxxxxxxxxxxxx, xfs@xxxxxxxxxxx, sandeen@xxxxxxxxxxx
In-reply-to: <1280443516-14448-1-git-send-email-david@xxxxxxxxxxxxx>
References: <1280443516-14448-1-git-send-email-david@xxxxxxxxxxxxx>
User-agent: Mutt/1.5.18 (2008-05-17)
On Fri, Jul 30, 2010 at 08:45:16AM +1000, Dave Chinner wrote:
> If we get two unaligned direct IOs to the same filesystem block
> that is marked as a new allocation (i.e. buffer_new), then both IOs will
> zero the portion of the block they are not writing data to. As a
> result, when the IOs complete there will be a portion of the block
> that contains zeros from the last IO to complete rather than the
> data that should be there.

Urgh.  Yuck.
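
To make the failure mode in the quoted description concrete, here is a
minimal user-space sketch. It is not kernel code: BLOCK_SIZE, the
unaligned_write() and simulate() helpers, and the half-block split are
all assumptions chosen for illustration. Each modelled IO zeroes the part
of the new block it is not writing, so whichever IO completes last wipes
out the other's data.

```python
BLOCK_SIZE = 4096  # assumed filesystem block size, for illustration only


def unaligned_write(disk_block, offset, data):
    """Model one unaligned direct IO to a block it sees as newly allocated.

    The IO zeroes the head and tail of the block that it is not writing,
    then stores its own data in between.
    """
    end = offset + len(data)
    disk_block[:offset] = bytes(offset)           # zero the head
    disk_block[offset:end] = data                 # the IO's own payload
    disk_block[end:] = bytes(BLOCK_SIZE - end)    # zero the tail


def simulate(ios):
    """Apply the IOs in completion order and return the final block."""
    disk_block = bytearray(BLOCK_SIZE)
    for offset, data in ios:
        unaligned_write(disk_block, offset, data)
    return bytes(disk_block)


io_a = (0, b"A" * 2048)       # first half of the block
io_b = (2048, b"B" * 2048)    # second half of the same block

final = simulate([io_a, io_b])
lost = final[:2048] == bytes(2048)   # True when A's data was zeroed by B
```

Reversing the completion order loses B's half instead, which matches
"zeros from the last IO to complete" in the description above.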

> This is easily manifested by qemu using aio+dio with an unaligned
> guest filesystem - every IO is unaligned and filesystem corruption is
> encountered in the guest filesystem. xfstest 240 (from Eric Sandeen)
> is also a simple reproducer.
> 
> To avoid this problem, track unaligned IO that triggers sub-block zeroing and
> check new incoming unaligned IO that requires sub-block zeroing against that
> list. If we get an overlap where the start and end of unaligned IOs hit the
> same filesystem block, then we need to block the incoming IOs until the IO
> that is zeroing the block completes. The blocked IO can then continue without
> needing to do any zeroing and hence won't overwrite valid data with zeros.

Urgh.  Yuck.
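
For readers following along, the overlap check described in the quoted
text can be sketched roughly as below. This is only a model of the idea,
not the patch itself: the UnalignedTracker class, the zeroed_blocks()
helper and the 4096-byte BLOCK_SIZE are hypothetical names and values,
and the sketch ignores whether a block is actually buffer_new.

```python
BLOCK_SIZE = 4096  # assumed filesystem block size


def zeroed_blocks(offset, length):
    """Return the block indices where an IO would need sub-block zeroing.

    Only the first block (unaligned start) and the last block (unaligned
    end) can be partially written, so only those are candidates.
    """
    blocks = set()
    if length <= 0:
        return blocks
    end = offset + length
    if offset % BLOCK_SIZE:
        blocks.add(offset // BLOCK_SIZE)
    if end % BLOCK_SIZE:
        blocks.add((end - 1) // BLOCK_SIZE)
    return blocks


class UnalignedTracker:
    """Track in-flight unaligned IOs by the blocks they are zeroing."""

    def __init__(self):
        self.in_flight = {}  # io id -> set of block indices being zeroed

    def must_wait(self, offset, length):
        """True if a new IO shares a partially written block with an
        in-flight IO and so has to wait for that IO to complete."""
        new = zeroed_blocks(offset, length)
        return any(new & blocks for blocks in self.in_flight.values())

    def start(self, io_id, offset, length):
        self.in_flight[io_id] = zeroed_blocks(offset, length)

    def complete(self, io_id):
        self.in_flight.pop(io_id, None)


tracker = UnalignedTracker()
tracker.start("a", 512, 1024)                # zeroes parts of block 0
blocked = tracker.must_wait(2048, 1024)      # also lands in block 0
```

Block-aligned IOs return an empty set from zeroed_blocks() and never
wait, which is why only the unaligned cases need tracking at all.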

Could we perhaps handle this by making an IO instantiate a page cache
page for partial writes, and forcing that portion of the IO through the
page cache?  The second IO would hit the same page and use the existing
O_DIRECT vs page cache paths.
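
A rough user-space model of that suggestion, under heavy assumptions:
page_cache, partial_write_via_cache() and writeback() are hypothetical
stand-ins, not the kernel's page cache or its O_DIRECT paths. The point
it illustrates is only that a shared page is zeroed once, when it is
instantiated, so a second partial write to the same block merges into it
instead of zeroing the first write's data.

```python
BLOCK_SIZE = 4096  # assumed block size, equal to the page size here

page_cache = {}  # block index -> bytearray; hypothetical stand-in


def partial_write_via_cache(block, offset, data):
    """Route a sub-block write through a shared per-block page.

    The page is created (and zeroed) only by the first writer; later
    writers find the existing page and just copy their data in.
    """
    page = page_cache.setdefault(block, bytearray(BLOCK_SIZE))
    page[offset:offset + len(data)] = data
    return page


def writeback(disk, block):
    """Flush the merged page to the modelled disk as one whole block."""
    disk[block] = bytes(page_cache[block])


disk = {}
partial_write_via_cache(0, 0, b"A" * 2048)      # first IO instantiates the page
partial_write_via_cache(0, 2048, b"B" * 2048)   # second IO hits the same page
writeback(disk, 0)
```

After writeback, the modelled block holds both halves, whichever write
arrived first.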

-- 
Matthew Wilcox                          Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."
