What to do when... xfs_repair hangs?
Sean Caron
scaron at umich.edu
Fri May 30 14:49:13 CDT 2014
Hi all,
Long story short, we have a big array formatted as XFS, we had a machine go
down hard maybe a month, month and a half ago... when it came back up, XFS
faulted out when we attempted to mount the filesystem; it complained the
log was bad or something... I did a dry run of xfs_repair (-n) and it
looked pretty bad, so we mounted up the filesystem read-only, ran a
backup... I think we got pretty much everything out OK except maybe files
that were open at the time of the crash.
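
For reference, the sequence above corresponds roughly to the sketch below. This is an illustration under assumptions, not a transcript of what was actually typed: the device name and paths are placeholders, the norecovery mount option and the use of rsync as the copy tool are assumed, and the function only prints the commands so they can be reviewed before anything is run. -n is the no-modify option documented in xfs_repair(8).

```shell
#!/bin/sh
# Sketch of the check / read-only mount / backup sequence.
# All arguments are hypothetical placeholders, not the real device or paths.
# The function only prints the commands; nothing is executed against a disk.
plan_recovery() {
    dev=$1; mnt=$2; dest=$3
    # 1. No-modify check: -n reports problems without writing to the device.
    echo "xfs_repair -n $dev"
    # 2. Read-only mount; norecovery skips log replay (assumed to be needed
    #    here because the log was reported as bad).
    echo "mount -t xfs -o ro,norecovery $dev $mnt"
    # 3. Copy everything off; rsync is an assumed choice of backup tool.
    echo "rsync -a $mnt/ $dest/"
    # 4. Unmount again before attempting any real repair.
    echo "umount $mnt"
}

plan_recovery /dev/sdX1 /mnt/array /mnt/backup
```

Printing the plan first seemed the safer design on a box with a single console, since a mistyped device name shows up before it can do any harm.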
Now with a backup in hand, we kicked off xfs_repair "for real"... it ran
for a while and did its thing, but now it appears to be stuck at this stage:
- agno = 436
rebuilding directory inode ...
rebuilding directory inode ...
rebuilding directory inode ...
...
- traversal finished ...
- moving disconnected inodes to lost+found ...
disconnected inode 1109099673,
and then it just stops. I don't know how long it's been sitting like that,
but it hasn't moved in the last hour or two. I assume that's not good...
Interestingly, when we ran a dry run of xfs_repair (-n) it got all the way
through; it never hung up at any point. Not sure why it would start to hang
once it gets run "for real".
This machine is in single-user mode. I have exactly 24 lines of console
with no scrollback buffer, and no other tty available besides the system
console, which is the one I'm running xfs_repair on.
Running Linux kernel 3.4.61, Ubuntu 12.04 LTS 64-bit with whatever their
current xfsprogs is.
This is a bit of an exceptional situation for me; I've never seen
xfs_repair just hang outright. I hoped I could maybe get some feedback from
the experts here... what should I do?
Try to Control-C out of the xfs_repair and ... re-run it?
Should I just quit wasting time at this point, wipe out the filesystem,
reformat, then just start the long process of restoring from the backups?
Original plan was just to run xfs_repair, see what happened and pull from
backups as required to fix damage. Perhaps we should just cut to the chase,
rebuild, and restore everything? Probably the file system would be
ultimately healthier starting from scratch, than what xfs_repair leaves
behind?
Any insight would be very much appreciated!
Thanks,
Sean