
Re: xfs_repair breaks with assertion

To: xfs@xxxxxxxxxxx
Subject: Re: xfs_repair breaks with assertion
From: Stan Hoeppner <stan@xxxxxxxxxxxxxxxxx>
Date: Thu, 11 Apr 2013 04:55:53 -0500
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <CAPaMSRCq0f+GqTbRRCXBFUDdtmpBx=VjBaOLpdDytXunL9dfmQ@xxxxxxxxxxxxxx>
References: <CAPaMSRCGSyhmnjrXpFFkEpmKrjsHqLn0kJ1xLGyf-WZosV7mmQ@xxxxxxxxxxxxxx> <20130411062515.GH10481@dastard> <CAPaMSRCq0f+GqTbRRCXBFUDdtmpBx=VjBaOLpdDytXunL9dfmQ@xxxxxxxxxxxxxx>
Reply-to: stan@xxxxxxxxxxxxxxxxx
User-agent: Mozilla/5.0 (Windows NT 5.1; rv:17.0) Gecko/20130328 Thunderbird/17.0.5

On 4/11/2013 1:34 AM, Victor K wrote:

> The raid array did not suffer, at least, not according to mdadm; it is now
> happily recovering the one disk that officially failed, but the whole thing
> assembled without a problem
> There was a similar crash several weeks ago on this same array, but it had
> an ext4 filesystem back then.
> I was able to save some of the latest stuff, and decided to move to xfs as
> something more reliable.
> I suspect now I should also have replaced the disk controller then.

Rebuilds are *supposed* to be transparent to the filesystem, but this is
not always the case, sometimes because of bugs.  In fact we just recently
saw an LVM bug wherein a pvmove operation was not transparent and hosed
up an XFS filesystem.  This is but one of many reasons I prefer
hardware-based RAID and volume management.  It isolates these functions
and the RAID memory structures from the kernel, and thus prevents such
kernel bugs from causing problems.  This may or may not be the source of
your apparent XFS corruption.  We don't have enough (log) data to
ascertain the cause at this point.

Running repair on an 8/10TB filesystem while md is rebuilding the
underlying RAID6 array isn't something I'd put a lot of trust in.  Wait
until the rebuild is finished and then run a non-destructive repair
(xfs_repair -n, with the filesystem unmounted).  Compare the results to
the previous repair.
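
As a rough illustration of "wait for md first", here is a minimal Python
sketch.  It assumes the usual /proc/mdstat layout, where a busy array
shows a progress line such as "recovery = 12.6% ..." or a
"resync=DELAYED" marker.  The function names and the /dev/md0 device
mentioned afterwards are hypothetical placeholders, not anything taken
from your system.

```python
import re

# Operations md reports in /proc/mdstat while an array is not idle.
# Assumption: the status follows the keyword as "= 12.6%" or "=DELAYED".
_MD_OP = re.compile(r"\b(recovery|resync|reshape|check)\s*=\s*(\S+)")


def md_busy(mdstat_text):
    """Return (operation, status) pairs for every md operation in progress."""
    return [(m.group(1), m.group(2)) for m in _MD_OP.finditer(mdstat_text)]


def repair_command(device):
    """Build an xfs_repair command line in no-modify mode (-n).

    With -n, xfs_repair only reports what it would change and does not
    write to the device.
    """
    return ["xfs_repair", "-n", device]
```

To use it, read /proc/mdstat, pass the text to md_busy(), and run the
list from repair_command("/dev/md0") (for example via subprocess.run)
only once md_busy() returns an empty list.  Saving that output to a file
makes it easy to compare against the earlier repair run.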

