Steve Lord wrote:
>
> On Tue, 2002-03-19 at 13:57, Simon Matter wrote:
> > Steve Lord wrote:
> > >
> > > On Tue, 2002-03-19 at 13:02, Juri Haberland wrote:
> > > >
> > > > Ok, when you mentioned the SW RAID1 root partition I remembered that I
> > > > have a similar box sitting here. It's also a fresh SGI-RH7.2
> > > > installation with all updates and all partitions are on a SW-RAID1, but
> > > > on SCSI disks, not on IDE disks.
> > > >
> > > > I ran three tests like yours (ntsysv (en/disabling time ; reboot)) and
> > > > afterwards I still had all files in /etc/xinetd.d with their proper
> > > > contents. I also had my .bash_history.
> > > > This box runs a 2.4.18-xfs-smp kernel from CVS, checked out on 4th of
> > > > March.
> > > >
> > > > Simon, what about a recent kernel? 2.4.9-31 is user contributed IIRC. It might
> > > > not be a good choice...
> >
> > You want me to cry? Not a good choice... I contributed them myself :-)
> > Seriously, I'll try a newer kernel as soon as I can.
>
> We are just trying to eliminate variables here - not blaming your
> merging skills. Juri appears to have a similar setup, except he
> does not see the problem with a recent kernel.
I was just trying to make a joke after a frustrating day...
I checked out 2.4.18-xfs from CVS 8 hours ago, compiled and tested it,
and I can confirm the problem is gone. Of course I don't know whether
the problem comes from the 1001 RedHat patches, the kernel, or XFS
itself.
Now, I cannot upgrade all servers to 2.4.18, and even if I could, I
have not had time to test it thoroughly, so I have to stick with the RH
2.4.9-31 kernel in production for now. The workaround is to modify the
/etc/rc.d/init.d/halt script to call sync several times, BEFORE the
umount -a. Syncing just before remounting / read-only does not help.
It would be nice if someone could put this into the FAQ, since it affects
all installations from the RedHat installer, at least on software RAID.
The patched halt script is attached to this mail as halt.gz.
-Simon
The patch:
-----------------------------------------------------------------------------
--- halt.orig Thu Feb 7 04:03:37 2002
+++ halt Wed Mar 20 17:33:48 2002
@@ -125,6 +125,23 @@
 [ -x /sbin/quotaoff ] && runcmd $"Turning off quotas: " /sbin/quotaoff -aug
+# Syncing disks, some kernels need this
+echo -n $"Syncing disks: "
+cnt=10
+while [ $cnt -ne 0 ]
+do
+ cnt=$[ cnt - 1 ]
+ sync
+ echo -n "."
+ sleep 1
+done
+if [ "$BOOTUP" = "color" ]; then
+ echo_success || echo_failure
+else
+ echo -n " done."
+fi
+echo
+
 # Unmount file systems, killing processes if we have to.
 # Unmount loopback stuff first
 remaining=`awk '!/^#/ && $1 ~ /^\/dev\/loop/ && $2 != "/" {print $2}' /proc/mounts`
-----------------------------------------------------------------------------
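For anyone who would rather not carry the whole patch, the loop above can also be sketched as a small standalone function. This is only an illustration under a few assumptions: the name sync_repeat and its two arguments (number of passes, seconds between passes) are made up here and are not part of the RedHat halt script, and the old $[ ... ] arithmetic is swapped for the POSIX $(( ... )) form. It does not use echo_success or echo_failure, so it should run outside the init environment as well.

```shell
#!/bin/sh
# Hypothetical helper, NOT part of the original halt script.
# Calls sync a fixed number of times and prints one dot per pass.
#   $1 = number of passes        (assumed default: 10, as in the patch)
#   $2 = seconds between passes  (assumed default: 1, as in the patch)
sync_repeat() {
    passes=${1:-10}
    delay=${2:-1}
    while [ "$passes" -gt 0 ]; do
        passes=$((passes - 1))   # POSIX arithmetic instead of $[ ... ]
        sync                     # ask the kernel to flush dirty buffers
        printf '.'
        sleep "$delay"
    done
    echo                         # end the line of dots
}
```

In the halt script it would be called as "sync_repeat 10 1" at the same place as the patched loop, that is, after quotaoff and before the unmount section. Whether ten passes are really needed is not something I have measured; the values simply mirror the patch.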
>
> Eric is slaving away over rpm build tools even as I type, attempting to
> push the latest xfs code into a redhat kernel.
>
> >
> > Just to confirm the IDE thing: I've tried the same on my home server now
> > which is SW-RAID1, but on SCSI disks, and it's the same problem. So
> > nothing with IDE and nothing with write cache.
> >
> > What about sync? I'm still wondering whether it's good to have it in
> > halt? With my modified halt script the problem seems to go away.
>
> Doing the sync in there is fine, does no harm at all.
>
> Steve
>
> --
>
> Steve Lord voice: +1-651-683-3511
> Principal Engineer, Filesystem Software email: lord@xxxxxxx
halt.gz
Description: Binary data