
Re: [PATCH] xfs: return errors from partial I/O failures to files

To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: [PATCH] xfs: return errors from partial I/O failures to files
From: David Jeffery <djeffery@xxxxxxxxxx>
Date: Thu, 27 Aug 2015 09:41:52 -0400
Cc: xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20150826221951.GE3902@dastard>
References: <20150826190636.GA16540@xxxxxxxxxxxxxxx> <20150826221951.GE3902@dastard>
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.7.0
On 08/26/2015 06:19 PM, Dave Chinner wrote:
> On Wed, Aug 26, 2015 at 03:06:36PM -0400, David Jeffery wrote:
>> There is an issue with xfs's error reporting in some cases of I/O partially
>> failing and partially succeeding. Calls like fsync() can report success even
>> though not all I/O was successful.
> 
> Hi David,
> 
> I read your bug report last night and after considering all the work
> you put into it, I was going to ask if you wanted to finish off the
> job by writing the patch to fix it. But you beat me to it.
> 
> Nice work! :)
> 
>> The issue can occur when there are multiple bios per xfs_ioend struct.
>> Each call to xfs_end_bio() for a completing bio writes a value to
>> ioend->io_error.  If a successful bio completes after any failed bio, no
>> error is reported due to it writing 0 over the error code set by the
>> failed bio.  The I/O error information is then lost, and when the ioend
>> is completed only success is reported back up the filesystem stack.
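
To spell out the clobbering with a toy model (hypothetical userspace code,
not the actual kernel completion path; io_error here stands in for
ioend->io_error):

	#include <errno.h>
	#include <stdio.h>

	static int io_error;	/* zeroed, like ioend->io_error at allocation */

	/* models the old xfs_end_bio() assignment */
	static void end_bio_old(int uptodate, int error)
	{
		io_error = uptodate ? 0 : error;
	}

	int main(void)
	{
		end_bio_old(0, -EIO);	/* bio to the failed disk */
		end_bio_old(1, 0);	/* bio to a good disk completes later */
		printf("io_error = %d\n", io_error);	/* prints 0; the -EIO is lost */
		return 0;
	}
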
> 
> It's worth mentioning the case that this was seen in - a single
> failed disk in a raid 0 stripe, and the error from the bio to the
> failed disk was overwritten by the successes from the bios to the
> other disks.
> 
> FWIW, I think we also need to create an xfstest for this case,
> because it's clear that this is a big hole in our test coverage
> (i.e. partial block device failure). It might be best to talk to
> Eryu (cc'd) to get your reproducer converted into an xfstest case
> that we can then test all filesystems against?
> 
>> xfs_end_bio() should only set ioend->io_error in the case of BIO_UPTODATE
>> being clear.  ioend->io_error is initialized to 0 at allocation, so it only
>> needs to be updated by any failed bios.  This ensures an error can be
>> reported to the application.
>>
>> Signed-off-by: David Jeffery <djeffery@xxxxxxxxxx>
>> ---
> 
> Best to add a "cc: <stable@xxxxxxxxxxxxxxx>" so that it gets pushed
> back to all the stable kernels automatically once it hits Linus'
> tree.
> 
> One minor change to the fix:
> 
>> diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
>> index 3859f5e..b82b128 100644
>> --- a/fs/xfs/xfs_aops.c
>> +++ b/fs/xfs/xfs_aops.c
>> @@ -356,7 +356,8 @@ xfs_end_bio(
>>  {
>>      xfs_ioend_t             *ioend = bio->bi_private;
>>  
>> -    ioend->io_error = test_bit(BIO_UPTODATE, &bio->bi_flags) ? 0 : error;
>> +    if (!test_bit(BIO_UPTODATE, &bio->bi_flags))
>> +            ioend->io_error = error;
> 
> We should preserve the original error that was reported, rather than
> report the last one. ioend->io_error is always initialised to zero,
> so we can simply do:

Is there a particular reason to prefer the first error?  Is it just
standard practice?  I originally made a version which preserved the
first error, but I couldn't think of a reason why the first or the last
would be better, so I went with the version that has the simpler if
statement.

> 
>       if (!ioend->io_error && !test_bit(BIO_UPTODATE, &bio->bi_flags))
>               ioend->io_error = error;
> 
> Can you update the patch and resend it? I've got a couple of other
> fixes that I need to push to the for-next tree in the next couple of
> days (i.e. before the 4.3 merge window opens) and I'd like to get
> this one in as well.
> 
> Cheers,
> 
> Dave
> 

I'll get a new version sent.
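
For reference, the v2 hunk should end up looking roughly like this (untested,
just your suggestion folded in so the first error reported is the one kept):

-	ioend->io_error = test_bit(BIO_UPTODATE, &bio->bi_flags) ? 0 : error;
+	if (!ioend->io_error && !test_bit(BIO_UPTODATE, &bio->bi_flags))
+		ioend->io_error = error;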

David Jeffery
