[Top] [All Lists]

Re: [patch 0/9] writeback data integrity and other fixes (take 3)

To: Nick Piggin <npiggin@xxxxxxx>
Subject: Re: [patch 0/9] writeback data integrity and other fixes (take 3)
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Wed, 29 Oct 2008 14:16:45 +1100
Cc: akpm@xxxxxxxxxxxxxxxxxxxx, xfs@xxxxxxxxxxx, linux-fsdevel@xxxxxxxxxxxxxxx, Chris Mason <chris.mason@xxxxxxxxxx>
In-reply-to: <20081029001653.GF15599@xxxxxxxxxxxxx>
Mail-followup-to: Nick Piggin <npiggin@xxxxxxx>, akpm@xxxxxxxxxxxxxxxxxxxx, xfs@xxxxxxxxxxx, linux-fsdevel@xxxxxxxxxxxxxxx, Chris Mason <chris.mason@xxxxxxxxxx>
References: <20081028144715.683011000@xxxxxxx> <20081028153953.GB3082@xxxxxxxxxxxxx> <20081028222746.GB4985@disturbed> <20081029001653.GF15599@xxxxxxxxxxxxx>
User-agent: Mutt/1.5.18 (2008-05-17)
On Wed, Oct 29, 2008 at 01:16:53AM +0100, Nick Piggin wrote:
> On Wed, Oct 29, 2008 at 09:27:46AM +1100, Dave Chinner wrote:
> > On Tue, Oct 28, 2008 at 04:39:53PM +0100, Nick Piggin wrote:
> > Yes - that's coming from end_buffer_async_write() when an error is
> > reported in bio completion. This does:
> > 
> >  465                 set_bit(AS_EIO, &page->mapping->flags);
> >  466                 set_buffer_write_io_error(bh);
> >  467                 clear_buffer_uptodate(bh);
> >  468                 SetPageError(page);
> > 
> > Hmmmm - do_fsync() calls filemap_fdatawait() which ends up in
> > wait_on_page_writeback_range() which is appears to be checking the
> > mapping flags for errors. I wonder why that error is not being
> > propagated then? AFAICT both XFS and the fsync code are doing the
> > right thing but somewhere the error has gone missing...
> This one-liner has it reporting EIO errors like a champion. I
> don't know if you'll actually need to put this into the
> linux API layer or not, but anyway the root cause of the problem
> AFAIKS is this.
> --
> XFS: fix fsync errors not being propogated back to userspace.
> ---
> Index: linux-2.6/fs/xfs/xfs_vnodeops.c
> ===================================================================
> --- linux-2.6.orig/fs/xfs/xfs_vnodeops.c
> +++ linux-2.6/fs/xfs/xfs_vnodeops.c
> @@ -715,7 +715,7 @@ xfs_fsync(
>       /* capture size updates in I/O completion before writing the inode. */
>       error = filemap_fdatawait(VFS_I(ip)->i_mapping);
>       if (error)
> -             return XFS_ERROR(error);
> +             return XFS_ERROR(-error);


Yeah, that'd do it. Good catch. I can't believe I recently fixed a
bug that touched these lines of code without noticing the inversion.
Sometimes I wonder if we should just conver the entire of XFS to
return negative errors - mistakes in handling negative error numbers
in the core XFS code happen all the time.

FWIW, the core issue here is that we've got to do the
filemap_fdatawait() call in the ->fsync method because ->fsync
gets called before we've waited for the data I/O to complete.
XFS updates inode state on I/O completion, so we *must* wait
for data I/O to complete before logging the inode changes. I
think btrfs has the same problem....

Thanks again, Nick.


Dave Chinner

<Prev in Thread] Current Thread [Next in Thread>