On Wed, 14 Nov 2001 15:31:51 +0200 (EET),
Mihai RUSU <dizzy@xxxxxxxxx> wrote:
>we have a linux 2.4.10-XFS running on a 2xPIII (1GHz) + 1 Gb RAM + 4 x
>36Gb SCSI HDD (on a Mylex AccellRAID170 RAID-5 array).
>I'll give more details about this. Some processes are starting to hang in D
>state (and don't die with kill -9). If I leave it like this, my load average
>increases every minute (I even got 200.0).
Load average is meaningless when processes are stuck in D state. They
count towards the "load" but are not really doing any work.
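As a rough illustration of that point, the sketch below parses /proc/loadavg and compares the 1-minute load with the number of currently runnable tasks (the fourth field, "runnable/total"). A load far above the runnable count hints that blocked tasks are inflating the figure. The function names and the factor of 4 are hypothetical choices for this example, not an established rule.

```python
def parse_loadavg(text):
    """Parse the contents of /proc/loadavg into a dict."""
    fields = text.split()
    # The fourth field has the form "runnable/total".
    runnable, total = fields[3].split("/")
    return {
        "load1": float(fields[0]),
        "load5": float(fields[1]),
        "load15": float(fields[2]),
        "runnable": int(runnable),
        "total": int(total),
    }


def looks_blocked(info, factor=4.0):
    """Hypothetical heuristic: 1-minute load far exceeds the runnable count."""
    return info["load1"] > factor * max(info["runnable"], 1)


if __name__ == "__main__":
    with open("/proc/loadavg") as f:
        info = parse_loadavg(f.read())
    print(info, "blocked tasks suspected:", looks_blocked(info))
```

This only suggests that something is blocked; it does not say which process or why.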
>if i try to use sysrq+s (for emergency sync) it starts sync-ing on other
>partitions but the syncing process hangs on the XFS partition
That points to (but does not guarantee) a problem in XFS code.
The first step is to ensure that kdb is compiled in and is active; the
boot messages must say "kdb version 1.9 by Scott Lurndal, Keith Owens".
Also ensure that you compiled and booted with a serial console
(see Documentation/serial-console.txt). Capture the output via the
serial console, using whichever comms program you prefer.
When the problem occurs, drop into kdb, using Pause on the normal
keyboard or control-A on the serial console (note: a program such as
getty must be reading from the serial console). Identify the process
that is stuck in D state and do 'btp <process-id>'. That will identify
where the process is hung; it is probably waiting on a lock.
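If a shell is still usable, one way to find the process id to pass to btp is to scan /proc for tasks whose state letter is D. The sketch below is only an illustration run from user space, not part of kdb; the helper names are made up for this example, and it assumes the usual "pid (comm) state ..." layout of /proc/<pid>/stat.

```python
import os


def parse_stat(text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    lparen = text.index("(")
    rparen = text.rindex(")")
    pid = int(text[:lparen])
    comm = text[lparen + 1:rparen]
    state = text[rparen + 1:].split()[0]
    return pid, comm, state


def d_state_tasks(proc_root="/proc"):
    """List (pid, comm) for every task in uninterruptible sleep (state D)."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or the file was unreadable
        if state == "D":
            found.append((pid, comm))
    return sorted(found)


if __name__ == "__main__":
    for pid, comm in d_state_tasks():
        print(pid, comm)
```

Each pid printed is a candidate for 'btp <process-id>'.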
If you can, do 'bta' and capture all the output but do not send it yet.
Send the btp output; that will help us identify the problem. The bta
output may be required to find out why the task is not moving.
If you are new to kdb, see Documentation/kdb, which contains several man
pages.