Re: XFS corruption on 3ware RAID6-volume

To: Emmanuel Florac <eflorac@xxxxxxxxxxxxxx>
Subject: Re: XFS corruption on 3ware RAID6-volume
From: Erik Gulliksson <erik@xxxxxxxxxxxxxx>
Date: Thu, 24 Feb 2011 11:20:26 +0100
Cc: xfs@xxxxxxxxxxx
In-reply-to: <20110223162316.45a49880@xxxxxxxxxxxxxxxxxxxx>
References: <AANLkTinWByYooMnPL7BryPowDexBeiHJdh3aVh+fdm-a@xxxxxxxxxxxxxx> <20110223154651.54f0a8dc@xxxxxxxxxxxxxxxxxxxx> <AANLkTinsfwr5E7KkffwWOweWJLCmaLnLdtYA4g_m--b0@xxxxxxxxxxxxxx> <20110223162316.45a49880@xxxxxxxxxxxxxxxxxxxx>
Thanks for your comments Emmanuel.

> So the RAID array looks OK, the RAID controller doesn't report any
> particular problem. You said it was reported as 0 K. Where did you see
> 0 K reported?

No, I meant it is "OK" with the letter "O" :)

> What gives "dmesg | grep 3w-9xxx" ? and "tw_cli alarms" ? Was the
> filesystem under heavy write when the problem occured ?

The server has been restarted since the problems started, so there is
nothing notable in "tw_cli alarms" or dmesg. The controller was
performing a rebuild on the other unit when it happened, but I don't
think the actual XFS filesystem was under any particular load.

> I'd start with launching a RAID verify, to detect and correct possible
> on-disk coherency problems (it can't hurt anyway):
> tw_cli /c0/u0 start verify
> Then "tail -f /var/log/messages | grep 3w-9xxx" ...

I will try this overnight and see if anything is reported.

> I suppose that there are no problems to be discovered. Most probably
> IOs to the array were lost because of the bus reset.

That's what I am afraid of too.
