The sequence of events is:
Machine locks up - probably related to some Xwindows/application problem
(we use the Nvidia drivers)
Machine is reset
Fails to mount the root (XFS) file system - either with an oops or some
error telling us the file system is corrupt etc.
Attempts to reset again produce the same results as above.
Booting in rescue mode, running 'xfs_repair -L' and rebooting "fixes"
the problem. xfs_repair finds some lost files and puts them in lost+found
- these are usually files from /tmp or /var/tmp.
This doesn't happen every time a machine locks up, but it occurs maybe
once a week or so on one or another of our 60 or so workstations.
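
For reference, the rescue-mode step amounts to roughly the following. This
is only a sketch: /dev/hda1 is a placeholder, not the actual root device,
and the run/repair_root helpers are made up for illustration. The
'xfs_repair -n' (no-modify) pass is an addition that is not part of the
procedure described above; it is there so the findings can be noted
before the log is zeroed. By default the script only prints the commands;
set RUN=1 to execute them.

```shell
#!/bin/sh
# Sketch of the rescue-mode repair described above.
# ROOT_DEV is a hypothetical device name; substitute the real root partition.
ROOT_DEV="${ROOT_DEV:-/dev/hda1}"

# Print each command unless RUN=1 is set, in which case execute it.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$@"
    fi
}

repair_root() {
    dev="$1"
    # -n: no-modify mode, report problems without changing anything
    # (an assumption added here, not part of the original procedure)
    run xfs_repair -n "$dev"
    # -L: zero the log before repairing; anything still in the log is lost
    run xfs_repair -L "$dev"
}

repair_root "$ROOT_DEV"
```

The file system must not be mounted while xfs_repair runs, which is why
this has to be done from rescue media rather than from the installed
system.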
Stephen Lord wrote:
> On Mon, 2002-10-07 at 07:45, James Pearson wrote:
> > We have a number of workstations running RedHat 7.2 with a 2.4.18 XFS
> > 1.1 kernel - every now and then a (different) machine will crash/hang
> > and fail to boot with a kernel oops and/or with XFS errors when it tries
> > to mount the root file system.
> > The fix is to boot from floppy/CD in rescue mode and run 'xfs_repair -L'
> > on the root partition. The root file system is then mountable and the
> > machine reboots OK.
> > I don't have exact error messages (don't have time to write down the
> > exact errors, as the priority is to get the machine up and running ...)
> > Is this a known problem? If it isn't, I'll attempt to get more
> > information when it happens again.
> > James Pearson
> Actually, a change just went into the cvs tree this weekend which might
> be related to this. There is some zeroing of part of the log which is
> always supposed to happen during mount. For a read-only mount this was
> not happening - and the root is mounted this way. Should the machine
> be shut down and rebooted very shortly after this, there is a possibility
> of the second mount getting confused by the log contents.
> Is there any way this could be what is happening? Is this happening
> on the second of two boots which are close together?
> Currently there is no way to get this code except from a cvs kernel.
> We just put out some images of the first alpha of xfs 1.2; the next
> spin of these should include this fix (hint hint Eric).