

To: Matteo Frigo <athena@xxxxxxxx>
Subject: Re: [dm-devel] [BUG] pvmove corrupting XFS filesystems (was Re: [BUG] Internal error xfs_dir2_data_reada_verify)
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Fri, 8 Mar 2013 12:57:23 +1100
Cc: dm-devel@xxxxxxxxxx, xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <87hakmpxac.fsf@xxxxxxxx>
References: <87d2vnc34r.fsf@xxxxxxxx> <20130226044039.GM5551@dastard> <20130227010414.GD1514@xxxxxxxxxxxxxxxxxx> <20130227014900.GY5551@dastard> <87y5eah4xz.fsf@xxxxxxxx> <87k3pjs908.fsf@xxxxxxxx> <20130307223140.GU23616@dastard> <87hakmpxac.fsf@xxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
On Thu, Mar 07, 2013 at 07:09:31PM -0500, Matteo Frigo wrote:
> Dave Chinner <david@xxxxxxxxxxxxx> writes:
> 
> > You need the XFS patch I posted so that readahead buffer
> > verification is avoided in the case of an error being returned from
> > the readahead.
> 
> I apologize if I was not clear in my previous post.  I mean to say that
> returning -EIO from dm, even in conjunction with your patch, is not
> sufficient to fix the problem.
> 
> Specifically, I repeated the experiment with v3.8.2 patched as discussed
> below, running my original script (repeated here for completeness):
> 
>    pvcreate /dev/vd[bc]
>    vgcreate test /dev/vd[bc]
>    lvcreate -L 8G -n vol test /dev/vdb
>    mkfs.xfs -f /dev/mapper/test-vol
>    mount -o noatime /dev/mapper/test-vol /mnt
>    cd /mnt
>    git clone ~/linux-stable
>    cd /
>    umount /mnt
> 
>    mount -o noatime /dev/mapper/test-vol /mnt
>    pvmove -b /dev/vdb /dev/vdc
>    sleep 2
>    rm -rf /mnt/linux-stable
> 
> I obtained a string of errors that starts with this:
> 
>   [  166.596574] XFS (dm-1): metadata I/O error: block 0x805060 ("xfs_trans_read_buf_map") error 5 numblks 8
>   [  166.599556] XFS (dm-1): metadata I/O error: block 0x805060 ("xfs_trans_read_buf_map") error 5 numblks 8
>   [  166.604845] XFS (dm-1): metadata I/O error: block 0x5285b8 ("xfs_trans_read_buf_map") error 5 numblks 8
>   [  166.607894] XFS (dm-1): metadata I/O error: block 0x5285b8 ("xfs_trans_read_buf_map") error 5 numblks 8
>   [  166.614242] XFS (dm-1): metadata I/O error: block 0x54f2b0 ("xfs_trans_read_buf_map") error 5 numblks 8
>   [  166.617307] XFS (dm-1): metadata I/O error: block 0x54f2b0 ("xfs_trans_read_buf_map") error 5 numblks 8
>   [  166.651373] XFS (dm-1): Corruption detected. Unmount and run xfs_repair
>   [  166.653517] XFS (dm-1): Corruption detected. Unmount and run xfs_repair
>   [  166.655545] XFS (dm-1): Corruption detected. Unmount and run xfs_repair
>   [  166.657614] XFS (dm-1): Corruption detected. Unmount and run xfs_repair
>   [  166.659685] XFS (dm-1): Corruption detected. Unmount and run xfs_repair
>   [  166.661731] XFS (dm-1): Corruption detected. Unmount and run xfs_repair
>   [  166.663761] XFS (dm-1): Corruption detected. Unmount and run xfs_repair

Add the patch below. If you still see errors, then they are real
IO errors from the block device.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx

xfs: ensure we capture IO errors correctly

From: Dave Chinner <dchinner@xxxxxxxxxx>

Failed buffer readahead can leave the buffer in the cache marked
with an error. Most callers that then issue a subsequent read on the
buffer do not zero the b_error field out, and so we may incorrectly
detect an error during IO completion due to the stale error value
left on the buffer.

Avoid this problem by zeroing the error before IO submission. This
ensures that the only IO errors that are detected are those captured
from bio submission or completion.

Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
---
 fs/xfs/xfs_buf.c |    6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index 50eb603..82b70bd 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -1336,6 +1336,12 @@ _xfs_buf_ioapply(
        int             size;
        int             i;
 
+       /*
+        * Make sure we capture only current IO errors rather than stale errors
+        * left over from previous use of the buffer (e.g. failed readahead).
+        */
+       bp->b_error = 0;
+
        if (bp->b_flags & XBF_WRITE) {
                if (bp->b_flags & XBF_SYNCIO)
                        rw = WRITE_SYNC;
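
For readers following the thread, the stale-error problem the patch
addresses can be modelled in userspace. This is only a rough sketch and
not kernel code: struct toy_buf, toy_ioapply() and
toy_read_after_failed_readahead() are hypothetical names invented for
illustration, and the only detail taken from the patch is that b_error
is zeroed before IO submission.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for struct xfs_buf; only the error field is modelled. */
struct toy_buf {
	int b_error;
};

/*
 * Toy model of IO submission. io_result is 0 for a successful IO or a
 * negative errno for a failed one. clear_stale selects the patched
 * behaviour (zero b_error before submission) or the unpatched one.
 */
static int toy_ioapply(struct toy_buf *bp, int io_result, int clear_stale)
{
	if (clear_stale)
		bp->b_error = 0;	/* what the patch adds */

	if (io_result)
		bp->b_error = io_result;	/* error captured from this IO */

	return bp->b_error;	/* what IO completion would see */
}

/*
 * A failed readahead followed by a successful read of the same cached
 * buffer. Returns the error observed by the second read.
 */
static int toy_read_after_failed_readahead(int clear_stale)
{
	struct toy_buf bp = { 0 };

	toy_ioapply(&bp, -EIO, clear_stale);	/* readahead fails */
	return toy_ioapply(&bp, 0, clear_stale);	/* real read succeeds */
}

/* Self-check: stale error leaks without the reset, not with it. */
static void toy_selfcheck(void)
{
	assert(toy_read_after_failed_readahead(0) == -EIO);
	assert(toy_read_after_failed_readahead(1) == 0);
}
```

In this model the second read reports -EIO when the field is not reset,
even though that IO itself succeeded, and reports 0 once the reset is in
place. Calling toy_selfcheck() from a main() exercises both cases.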
