On 4/11/2013 1:34 AM, Victor K wrote:
> The raid array did not suffer, at least, not according to mdadm; it is now
> happily recovering the one disk that officially failed, but the whole thing
> assembled without a problem.
> There was a similar crash several weeks ago on this same array, but it had
> an ext4 filesystem back then.
> I was able to save some of the latest stuff, and decided to move to XFS as
> something more reliable.
> I suspect now I should also have replaced the disk controller then.
Rebuilds are *supposed* to be transparent to the filesystem, but that is
not always the case, sometimes due to bugs. In fact, we recently saw an
LVM bug wherein a pvmove operation was not transparent and hosed an XFS
filesystem. This is but one of many reasons I prefer hardware-based RAID
and volume management: it isolates these functions and the RAID memory
structures from the kernel, and thus prevents such bugs from causing
problems. This may or may not be the source of your apparent XFS
corruption. We don't have enough (log) data to ascertain the cause at
this point.
Running repair on an 8-10TB filesystem while md is rebuilding the
underlying RAID6 array isn't something I'd put a lot of trust in. Wait
until the rebuild is finished, then run a non-destructive (no-modify)
repair and compare the results to the previous repair.
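
Roughly like this, assuming your array is /dev/md0 and mounted at
/storage (substitute your actual device and mount point):

  cat /proc/mdstat          # wait until the recovery/resync line is gone
  umount /storage           # xfs_repair needs the filesystem unmounted
  xfs_repair -n /dev/md0    # -n = no-modify mode, reports problems only

With -n, xfs_repair only reports what it would fix, so it's safe to run
and re-run without touching anything on disk.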
--
Stan