On Tue, 2003-11-11 at 15:12, Nathan Scott wrote:
[...]
> > > fails with a kernel panic, and any subsequent attempts to mount or
> > > call xfs_repair on that disk just hang.
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> If you have just rebooted and run xfs_repair on the block device,
> and it hangs, then this points to either a hardware problem or a
> device driver bug - the XFS kernel code is not involved at all in
> that case.
This is both correct and incorrect. I don't think he's rebooting before
he runs xfs_repair, i.e. not mounting that disk at all and then running
xfs_repair. What I think he should try is mounting that FS read-only
(ro). Eric and I saw something similar recently after a crash caused by
a power outage: the XFS log was damaged (xfs_logprint showed "ERROR!!!"
in its output), and attempting the mount would segfault and make the
kernel oops.
So as for the correct part: yes, xfs_repair will hang because that
device is now locked, and the only way to undo that is to reboot. As for
why this happens, I don't think many people are clear on that yet (look
for "broken log" in the list history). The only way to test for this
condition is to not mount that FS at all, except with either
-o ro,norecovery or -o ro.
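
For illustration, the two read-only mount variants could look like the
sketch below. The device and mount point are hypothetical placeholders,
and the helper ro_mount_cmd is made up for this mail, not part of any
XFS tool. It only prints the command so you can review it before
running it, since a bad mount here can take the kernel down.

```shell
#!/bin/sh
# Sketch only: print (do not run) a read-only XFS mount command.
#   $1 = block device        (hypothetical, e.g. /dev/sdb1)
#   $2 = mount point         (hypothetical, e.g. /mnt/recover)
#   $3 = "norecovery" to also skip log replay (optional)
ro_mount_cmd() {
    opts="ro"
    if [ "${3:-}" = "norecovery" ]; then
        opts="ro,norecovery"
    fi
    echo "mount -t xfs -o $opts $1 $2"
}

# Example, as root and only after a fresh reboot:
#   ro_mount_cmd /dev/sdb1 /mnt/recover norecovery
#   ro_mount_cmd /dev/sdb1 /mnt/recover
```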
What I found is that if you mount with -o ro and it works, something
has been done to the log (even xfs_logprint doesn't show what), and you
can then mount normally, as long as you see a log replay in dmesg. If
you do, you should be good and no xfs_repair is needed. If you can't
mount with -o ro, then yes, xfs_repair is needed, and you will need to
run xfs_repair -L. Hope this helps. BTW, can someone help this guy dd
out his log so he has a copy to send to this list?
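
A rough sketch of that decision, in case it helps. The function name
next_step and the grep pattern for the replay message are my
assumptions; the exact wording of the kernel message differs between
kernel versions, so adjust the pattern to what your dmesg shows. The
"undetermined" branch (read-only mount works but no replay is visible)
is not a case I covered above, so the sketch only flags it.

```shell
#!/bin/sh
# Sketch only: pick the next step after a read-only mount attempt.
#   $1 = exit status of the `mount -o ro` attempt
#   $2 = kernel log text captured right after it (e.g. from dmesg)
next_step() {
    ro_status=$1
    kernel_log=$2
    if [ "$ro_status" -ne 0 ]; then
        # Read-only mount failed: repair is needed, zeroing the log.
        echo "xfs_repair -L"
    elif printf '%s\n' "$kernel_log" | grep -qi 'xfs.*recovery'; then
        # Read-only mount worked and a log replay was reported.
        echo "mount normally"
    else
        # Mounted read-only, but no replay message was found.
        echo "undetermined"
    fi
}

# Example (hypothetical device and mount point, run as root):
#   mount -t xfs -o ro /dev/sdb1 /mnt/recover
#   next_step "$?" "$(dmesg | tail -n 50)"
```

Note that xfs_repair -L throws away whatever is in the log, so get a
copy of the log first if you want anyone on the list to look at it.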
> cheers.
>
> --
> Nathan
--
Austin Gonyou <austin@xxxxxxxxxxxxxxx>
Coremetrics, Inc.