
Re: [PATCH] xfs: fix double free of trans in log recovery on I/O error

To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: [PATCH] xfs: fix double free of trans in log recovery on I/O error
From: Alex Lyakas <alex@xxxxxxxxxxxxxxxxx>
Date: Tue, 16 Sep 2014 19:01:25 +0300
Cc: Brian Foster <bfoster@xxxxxxxxxx>, xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20140902220216.GJ20518@dastard>
References: <1408648692-15957-1-git-send-email-bfoster@xxxxxxxxxx> <CAOcd+r32+6gafv5+MCNyNTH1Ncas9CJHHrku+Y=sZefDzzS1Zw@xxxxxxxxxxxxxx> <20140825142025.GA10135@xxxxxxxxxxxxxxx> <AFED541EB7FA490C8E16DCB81F400B1C@alyakaslap> <20140831210507.GA11913@xxxxxxxxxxxxxxx> <3476A2CBDE694DC6BD06DBDD15165151@alyakaslap> <20140902220216.GJ20518@dastard>
Hi Dave,

On Wed, Sep 3, 2014 at 1:02 AM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
On Tue, Sep 02, 2014 at 12:51:35PM +0300, Alex Lyakas wrote:
> Hi Brian, Dave,
> I tested this patch on 3.8.13 kernel with the scenario I described
> in http://oss.sgi.com/pipermail/xfs/2014-August/037637.html, but I
> still see the issue.
> I placed the metadump at https://drive.google.com/file/d/0ByBy89zr3kJNV2UxMERNTkE4aHM/edit?usp=sharing
>
> During log recovery, 3 IO errors are encountered:
> [  340.381199] XFS (dm-0): Mounting Filesystem
> [  340.439897] XFS (dm-0): Sleep 10s before xlog_do_recover
> [  350.440143] XFS (dm-0): Starting recovery (logdev: internal)
> [  351.584647] XFS (dm-0): metadata I/O error: block 0x1
> ("xlog_recover_iodone") error 28 numblks 1
> [  351.584660] XFS (dm-0): metadata I/O error: block 0x40
> ("xlog_recover_iodone") error 28 numblks 16
> [  351.584665] XFS (dm-0): xfs_do_force_shutdown(0x1) called from
> line 377 of file
> /mnt/work/alex/zadara-btrfs/fs/xfs/xfs_log_recover.c. Return
> address = 0xffffffffa0372728
> [  351.584969] XFS (dm-0): I/O Error Detected. Shutting down filesystem
> [  351.584970] XFS (dm-0): Please umount the filesystem and rectify
> the problem(s)
> [  351.585047] XFS (dm-0): metadata I/O error: block 0x1e00040
> ("xlog_recover_iodone") error 28 numblks 16
> [  351.585050] XFS (dm-0): xfs_do_force_shutdown(0x1) called from
> line 377 of file
> /mnt/work/alex/zadara-btrfs/fs/xfs/xfs_log_recover.c. Return
> address = 0xffffffffa0372728
> [  351.585068] XFS (dm-0): log mount/recovery failed: error 28
> [  351.585332] XFS (dm-0): log mount failed
>
> Two IO error callbacks are handled before XFS is unmounted, but the
> last one crashes with stack[1].
>
> Do I need some or all of the 9 patches that Dave posted? (They do
> not apply to my kernel, so I need to apply them by hand).

No, I suspect that there are other problems that have been fixed
since 3.8 that you are missing. e.g.

9c23ecc xfs: unmount does not wait for shutdown during unmount

I applied this patch, and on top of that applied your patch "[PATCH 1/9] xfs: synchronous buffer IO needs a reference". However, the log recovery problem still reproduces.

At least with the 9c23ecc patch, the unmount-while-IO-error problems that I reported long ago seem to be fixed.

Thanks,
Alex.

There's bound to be others, so you're really going to need to look
at the differences between 3.8 and a current mainline to determine
what other patches you are going to need...

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx