Problem recovering XFS filesystem
Aaron Williams
aaron.w2 at gmail.com
Fri Apr 27 21:04:48 CDT 2012
On Fri, Apr 27, 2012 at 2:31 PM, Michael Monnerie <michael.monnerie at is.it-management.at> wrote:
> On Thursday, 26 April 2012 at 13:00:06, Aaron Williams wrote:
> > I was able to recover the filesystem.
>
> So your RAID busted the filesystem. Maybe the devs could want an
> xfs_metadump of the FS before your repair, so they can inspect it and
> improve xfs_repair.
>
Hi Michael,
It appears that way, or it may be that I mounted with nobarrier and the
contents of the battery-backed RAID cache were lost while the array was
being recovered. I have an Areca ARC-1210 controller that was in the middle
of rebuilding when I attempted to shut down and reboot my Linux system
after I mistakenly unplugged the wrong drive from my RAID array. Another
drive had failed earlier, and the array had finished rebuilding itself onto
a hot-spare drive. I intended to remove the bad drive to replace it but
disconnected the wrong one. After I reconnected the good drive, the array
started rebuilding itself again. At that point I decided it might be safer
to shut down Linux to replace the drive, expecting the RAID controller to
pick up the rebuild where it left off.
Linux did not shut down all the way, however. I don't know if it was
waiting for the array to finish rebuilding or if something else happened;
eventually I hit the reset button. The RAID BIOS then reported that it
could not find the array, and I had to rebuild it. I also ran a volume
check, which found and repaired about 70,000 blocks. Needless to say, I was
quite nervous.
Once that was done, Linux refused to mount the XFS partition, I think due
to corruption in the log.
I made an image of the pre-repair filesystem using dd and can try to
produce a metadump. The filesystem is 1.9 TB in size, with about 1.2 TB of
data in use.
It looks like I was able to recover everything after blowing away the log.
I see a bunch of files recovered in lost+found, but those all appear to be
things like cached web pages, etc. I also dumped the log to a file (128 MB).
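For reference, saving a copy of the dirty log and then blowing it away
look roughly like this (again with placeholder device names):

```shell
# Save a copy of the (dirty) log before destroying it; -C copies the
# log to a file instead of printing it.
xfs_logprint -C /backup/xfs-log.bin /dev/sdb1

# Zero the log and repair. -L discards any transactions still in the
# log, so it should only be a last resort when the log cannot be
# replayed by mounting the filesystem.
xfs_repair -L /dev/sdb1
```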
So far it looks like any actual data loss is minimal (thankfully), and
this was a good wake-up call to start doing more frequent backups.
I also upgraded xfsprogs from 3.1.6-2.1.2 to 3.1.8, which did a much
better job at recovery than my previous attempt.
It would be nice if xfs_db would let me continue when the log is dirty
instead of requiring me to mount the filesystem first. It would also be
nice if xfs_logprint could try to identify the file names of the inodes
involved.
I understand that there are plans to update XFS to include the UUID in all
of the on-disk structures. Any idea when this will happen?
-Aaron