
Re: [PATCH 1/2] dio: track and serialise unaligned direct IO

To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: [PATCH 1/2] dio: track and serialise unaligned direct IO
From: Jan Kara <jack@xxxxxxx>
Date: Mon, 2 Aug 2010 20:50:53 +0200
Cc: linux-fsdevel@xxxxxxxxxxxxxxx, xfs@xxxxxxxxxxx, sandeen@xxxxxxxxxxx
In-reply-to: <1280733945-16231-2-git-send-email-david@xxxxxxxxxxxxx>
References: <1280733945-16231-1-git-send-email-david@xxxxxxxxxxxxx> <1280733945-16231-2-git-send-email-david@xxxxxxxxxxxxx>
User-agent: Mutt/1.5.20 (2009-06-14)
On Mon 02-08-10 17:25:44, Dave Chinner wrote:
> From: Dave Chinner <dchinner@xxxxxxxxxx>
> If we get two unaligned direct IOs to the same filesystem block
> that is marked as a new allocation (i.e. buffer_new), then both IOs
> will zero the portion of the block they are not writing data to. As
> a result, when the IOs complete there will be a portion of the block
> that contains zeros from the last IO to complete rather than the
> data that should be there.
> This is easily manifested by qemu using aio+dio with an unaligned
> guest filesystem - every IO is unaligned and filesystem corruption is
> encountered in the guest filesystem. xfstest 240 (from Eric Sandeen)
> is also a simple reproducer.
> To avoid this problem, track unaligned IO that triggers sub-block
> zeroing and check new incoming unaligned IOs that require sub-block
> zeroing against that list. If we get an overlap where the start and
> end of unaligned IOs hit the same filesystem block, then we need to
> block the incoming IOs until the IO that is zeroing the block
> completes. The blocked IO can then continue without needing to do
> any zeroing and hence won't overwrite valid data with zeros.
> Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
> +/*
> + * Add a filesystem block to the list of blocks we are tracking.
> + */
> +static void
> +dio_start_zero_block(struct dio *dio, sector_t zero_block)
> +{
> +     struct dio_zero_block *zb;
> +
> +     zb = kmalloc(sizeof(*zb), GFP_NOIO);
> +     if (!zb)
> +             return;
  Ho hum, so if the allocation fails, we will just silently corrupt the
data anyway? Not good, I think.

Jan Kara <jack@xxxxxxx>
