OOM on quotacheck (again?)
Dave Chinner
david at fromorbit.com
Tue Oct 2 15:09:46 CDT 2012
On Tue, Oct 02, 2012 at 06:29:27PM +0200, Volker wrote:
> Hi again,
>
> > Great! That answered all my questions! Thanks a lot!
> >
> > 3.6.0-rc6-x64 is currently running fine on 6 machines.
>
> Just as a follow-up, I would like to share some info.
>
> The six machines mentioned above are still running fine. So are a few more
> we tested with the new kernel. All of the servers tested so far were
> rebooted immediately after the new 3.6 kernel was installed.
>
> Because of that, we decided to roll out the new kernel to all our
> servers (approximately 330) and have the kernel "sink in" over the next
> few days if the machines get rebooted.
>
> This morning we experienced some problems with the superblock being
> corrupted on 6 machines that had been rebooted during the night. For all
> of them, the following was true:
>
> a) the server was still running the old buggy 2.6.37 and had
> filesystem-troubles on heavy i/o (that was our problem to begin with
> besides the OOM)
>
> b) because of the filesystem-troubles the server had been rebooted by
> our hardware-support-team (sadly not necessarily using sys-requests)
> because the xfs-partition was unresponsive
>
> c) after being rebooted with the new 3.6 kernel, the server complained
> about the super-block of the xfs-partition being corrupted and was not
> able to mount the partition
>
> d) by running xfs_repair -L -P <device> we were able to fix the problem
>
> e) trying a remount of the fixed partition caused a quota-check which
> always ended in a stack-trace; after a reboot, the quota-check was fine
> and the partition successfully mounted
>
> Has anyone ever experienced problems like this updating from an older
> kernel to the current 3.6?
>
> Any idea what could have caused the bad superblock the 3.6 kernel
> complained about?
>
> Is it possible that the 2.6.37 kernel left a superblock behind that
> could not be recognized by the 3.6 kernel?
>
> If it's of any interest, I can supply the stack-traces.

Yes, it is of interest. Can you post everything you found out about
the problem (dmesg, stack traces, repair output, etc.)?

Cheers,
Dave.
--
Dave Chinner
david at fromorbit.com