At 13:57 2-6-2003 -0500, Austin Gonyou wrote:
Here is the stack trace from this kernel. This is 2.4.18-27 errata + XFS
1.2.0 release. I did re-work the spec file, but only to disable options
we don't want and also re-worked the config to our liking. I have one
patch I apply, but it is to scsi_scan.c for a BLIST entry for our
hardware. Overall, the core kernel config and source is relatively
unchanged. We patch nothing else, and just use RH's src.rpm to create
our i686 rpm. Usually, right before the crash, all the fibre channel
devices become inaccessible (local disks are still OK, and the whole
system is on XFS), and then poof. This only seems to occur during the load test I put this
thing through. If anyone would like to see it, I'd be happy to provide
the info.
I'm afraid I can't help you with that. I did patch the kernel but I'm not
skilled enough to actually be a kernel hacker. :-/
No one else has responded yet, which is a shame.
Can you tell me what hardware you are using? It doesn't really sound like
XFS is the problem here; my best guess is that something is causing a reset
in the fibre channel array, which in turn is causing a driver to stall.
I currently have the same problem with megaraid cards without optimizations
and the megaraid v2 driver. During high disk I/O it stalls for so long that
the write eventually returns an error and the box hangs.
Cheers
--
Seth
It might just be your lucky day, if you only knew.