Re: recovering corrupt filesystem after raid failure

To: David Lechner <david@xxxxxxxxxxxxxx>
Subject: Re: recovering corrupt filesystem after raid failure
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Mon, 22 Feb 2016 13:24:39 +1100
Cc: xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <56CA6492.7000407@xxxxxxxxxxxxxx>
References: <56CA6492.7000407@xxxxxxxxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
On Sun, Feb 21, 2016 at 07:29:54PM -0600, David Lechner wrote:
> Long story short, I had a dual disk failure in a raid 5. I've managed to
> get the raid back up and salvaged what I could. However, the xfs is
> seriously damaged. I've tried running xfs_repair, but it is failing and
> it recommended to send a message to this mailing list. This is an Ubuntu
> 12.04 machine, so xfs_repair version 3.1.7.

So the first thing to do is get a more recent xfsprogs package and
try that. There's not a lot of point in us looking at problems in a
four-and-a-half-year-old package that we've probably already fixed.
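
A quick way to confirm which xfs_repair is actually being picked up
after an upgrade is to parse what `xfs_repair -V` prints. The sketch
below assumes the output looks like "xfs_repair version 3.1.7", as
quoted above. The helper names and the (4, 0, 0) threshold in the usage
note are illustrative only, not a statement about which release
contains which fix.

```python
import re
import subprocess


def parse_xfs_repair_version(text):
    """Extract (major, minor, patch) from text like 'xfs_repair version 3.1.7'."""
    match = re.search(r"version\s+(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version found in: {text!r}")
    return tuple(int(part) for part in match.groups())


def is_older_than(text, minimum):
    """Return True if the version found in text sorts before the minimum tuple."""
    return parse_xfs_repair_version(text) < minimum


def installed_version_output():
    """Run 'xfs_repair -V' and return whatever it printed.

    stdout and stderr are joined because this sketch does not assume
    which stream the tool writes its version string to.
    """
    result = subprocess.run(
        ["xfs_repair", "-V"], capture_output=True, text=True, check=True
    )
    return result.stdout + result.stderr
```

For example, `is_older_than(installed_version_output(), (4, 0, 0))` would
flag the 3.1.7 binary from Ubuntu 12.04. The threshold is an arbitrary
placeholder; pick whatever the current release is.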

> The file system won't mount. Fails with "mount: Structure needs
> cleaning". So I tried xfs_repair. I had to resort to xfs_repair -L
> because the first 500MB or so of the filesystem was wiped out.

Oh, so even if you can repair the filesystem, your data is likely to
be irretrievably corrupted.

> Now,
> xfs_repair /dev/md127 gets stuck, so I am running xfs_repair -P
> /dev/md127. This gets much farther, but it is failing too. It gives an
> error message like this:
> 
> 
> ...
> disconnected inode 2101958, moving to lost+found
> corrupt dinode 2101958, extent total = 1, nblocks = 0.  This is a bug.
> Please capture the filesystem metadata with xfs_metadump and
> report it to xfs@xxxxxxxxxxxx
> cache_node_purge: refcount was 1, not zero (node=0x7f2c57e1b120)
> 
> fatal error -- 117 - couldn't iget disconnected inode
> 
> 
> 
> However, nblocks = 0 does not seem to be true...

Probably because it got cleared in memory before this problem was
tripped over.

> If I re-run xfs_repair -P /dev/md127, it will fail on a different,
> seemingly random inode with the same error message.

Yup, you definitely need to run a current xfs_repair on this
filesystem before going any further.
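
If the newer xfs_repair still trips over this, it helps to report
exactly which inodes were involved on each run. As a rough sketch,
assuming the repair output has been saved to a text file and the
message keeps the "corrupt dinode N, extent total = X, nblocks = Y"
form quoted above, something like this pulls the numbers out. The
function name and the dictionary keys are made up for illustration.

```python
import re

# Matches the message format quoted in this thread; other xfs_repair
# versions may word it differently.
DINODE_RE = re.compile(
    r"corrupt dinode (\d+), extent total = (\d+), nblocks = (\d+)"
)


def corrupt_dinodes(log_text):
    """Return one dict per 'corrupt dinode' line found in log_text."""
    found = []
    for line in log_text.splitlines():
        match = DINODE_RE.search(line)
        if match:
            inode, extent_total, nblocks = (int(g) for g in match.groups())
            found.append(
                {"inode": inode, "extent_total": extent_total, "nblocks": nblocks}
            )
    return found
```

Running it over the saved output of several attempts makes it easy to
see whether the same inodes recur or a different one fails each time.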

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
