
Re: recovering corrupt filesystem after raid failure

To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: recovering corrupt filesystem after raid failure
From: David Lechner <david@xxxxxxxxxxxxxx>
Date: Mon, 22 Feb 2016 11:53:26 -0600
Cc: xfs@xxxxxxxxxxx
In-reply-to: <20160222022439.GE14668@dastard>
References: <56CA6492.7000407@xxxxxxxxxxxxxx> <20160222022439.GE14668@dastard>
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.1
On 02/21/2016 08:24 PM, Dave Chinner wrote:
> On Sun, Feb 21, 2016 at 07:29:54PM -0600, David Lechner wrote:
>> Long story short, I had a dual disk failure in a RAID 5. I've managed to
>> get the RAID back up and salvaged what I could. However, the XFS
>> filesystem is seriously damaged. I've tried running xfs_repair, but it
>> fails and recommends sending a message to this mailing list. This is an
>> Ubuntu 12.04 machine, so xfs_repair version 3.1.7.
> 
> So the first thing to do is get a more recent xfsprogs package and
> try that. There's not a lot of point in us looking at problems in
> a four-and-a-half-year-old package when we've probably already fixed them.
> 
>> The file system won't mount. Fails with "mount: Structure needs
>> cleaning". So I tried xfs_repair. I had to resort to xfs_repair -L
>> because the first 500MB or so of the filesystem was wiped out.
> 
> Oh, so even if you can repair the filesystem, your data is likely to
> be irretrievably corrupted.
> 
>> Now,
>> xfs_repair /dev/md127 gets stuck, so I am running xfs_repair -P
>> /dev/md127. This gets much farther, but it fails too. It gives an
>> error message like this:
>>
>>
>> ...
>> disconnected inode 2101958, moving to lost+found
>> corrupt dinode 2101958, extent total = 1, nblocks = 0.  This is a bug.
>> Please capture the filesystem metadata with xfs_metadump and
>> report it to xfs@xxxxxxxxxxxx
>> cache_node_purge: refcount was 1, not zero (node=0x7f2c57e1b120)
>>
>> fatal error -- 117 - couldn't iget disconnected inode
>>
>>
>>
>> However, nblocks = 0 does not seem to be true...
> 
> Probably because it got cleared in memory before this problem was
> tripped over.
> 
>> If I re-run xfs_repair -P /dev/md127, it fails on a different,
>> seemingly random inode with the same error message.
> 
> Yup, you definitely need to run a current xfs_repair on this
> filesystem before going any further.
> 
> Cheers,
> 
> Dave.
> 

Thanks for the advice. The newer version was able to complete
successfully. I can now mount the file system and I ended up with 1.5TB
in lost+found, so at least there is still something there.
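
For anyone who lands here with the same problem, the sequence above can be
sketched roughly as follows. This is only an illustration, not the exact
commands that were run: version_gt and repair_sketch are hypothetical helper
names, the device and mount point are placeholders passed in by the caller,
and the no-modify pass (xfs_repair -n) is an added precaution that was not
part of the thread. The only version fact taken from the thread is that
3.1.7 failed.

```shell
#!/bin/sh
# Sketch only: the device and mount point are placeholders supplied by the caller.

# Version reported in the original post as failing.
FAILING_VERSION="3.1.7"

# Hypothetical helper: succeeds if version $1 is strictly newer than $2.
# Relies on "sort -V" (GNU coreutils version sort).
version_gt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical wrapper around the sequence described in the thread.
repair_sketch() {
    device=$1
    mountpoint=$2

    # "xfs_repair -V" prints a line such as "xfs_repair version 3.1.7".
    installed=$(xfs_repair -V 2>&1 | awk '{print $NF}')
    if ! version_gt "$installed" "$FAILING_VERSION"; then
        echo "xfs_repair $installed is not newer than $FAILING_VERSION;" \
             "upgrade xfsprogs first" >&2
        return 1
    fi

    # Assumption: a no-modify pass (-n) first, to see what would be changed.
    # It exits non-zero when it finds corruption, so its status is ignored.
    xfs_repair -n "$device" || true

    # The real repair run, then mount and look at what was salvaged.
    xfs_repair "$device" || return 1
    mount "$device" "$mountpoint" || return 1
    ls "$mountpoint/lost+found" | head
}

# Run only when both arguments are given, e.g.:
#   sh repair.sh /dev/md127 /mnt/recovery
if [ "$#" -eq 2 ]; then
    repair_sketch "$1" "$2"
fi
```

The version check is there only to avoid repeating a run with the same old
binary; it says nothing about which release actually contains the fix.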
