Date: Tue, 4 Nov 2003 05:04:04 +0100
On Mon, 3 Nov 2003 22:50:45 +0100
Rumi Szabolcs <rumi_ml@xxxxxxx> wrote:

> I'm using Gentoo xfs-sources-2.4.22 (w/XFS snapshot 2003.10.10)
> and the whole system including the kernel was compiled with
> gcc-3.3.2. The userland was compiled using "-march=pentium4
> -mfpmath=sse -O3 -pipe". The underlying filesystems are 1-2
> gigs in size, created with -l internal,size=16m.

Here is an update on the situation:

I have reinstalled the whole system from stage3 (in Gentoo terms
this means a fully precompiled system, built with gcc-3.2.3 using
-march=i686 and somewhat more conservative optimization settings)
and recompiled the same 2.4.22 kernel as above, but this time
with the stock gcc-3.2.3 shipped with Gentoo 1.4, and voilà:
it still exhibits the very same errors! I have to admit that
I was a bit surprised by that. I thought it was probably
gcc-3.3.2's fault, optimizing away some crucial code, but
apparently it is not.

Realizing that, I compiled xfs-sources-2.4.20-r3 on the same
stable userland, and with this kernel the problem disappeared!
(Gentoo kernels are usually patched up a bit from stock with -ac,
EVMS, XFS, etc., but this version works rock solid on two other
production servers of mine, with XFS on all filesystems.)
This kernel uses the XFS snapshot from 2003.04.07.

So my conclusion is that the problem is somewhere inside the
kernel: maybe in the stock codebase or one of the other patchsets,
but most likely in the XFS part.

I hope someone will comment on this really soon, as time is
running out and this machine must go into production within
2-3 days (and should have been in production yesterday already).
Alternatively, please direct me to LKML, gentoo-dev, or whatever
mailing list you think is more appropriate for discussing this
problem.

Szabolcs Rumi

