xfs

Re: unkillable process

To: Alan Eldridge <alane@xxxxxxxxxxxx>
Subject: Re: unkillable process
From: Keith Owens <kaos@xxxxxxxxxxxxxxxxx>
Date: Tue, 26 Jun 2001 11:43:37 +1000
Cc: linux-xfs@xxxxxxxxxxx
In-reply-to: Your message of "Mon, 25 Jun 2001 18:38:43 -0400." <20010625183843.A2197@xxxxxxxxxxxxxxxxxxxxx>
Sender: owner-linux-xfs@xxxxxxxxxxx
On Mon, 25 Jun 2001 18:38:43 -0400, 
Alan Eldridge <alane@xxxxxxxxxxxx> wrote:
>If cdda2wav is not setuid root, which I really don't want it to be, it
>seems, about 90% of the time, to get locked in "R" state, *not* reading the
>cdrom drive, using as much CPU as it can, and immune to kill -9. It's also
>immune to tracing ... any attempt to trace it results in strace locking up,
>and becoming a member of the undead also. That's pretty neat ... so not only
>can't you kill cdda2wav, you can't kill the strace, either.
>
>How can a process, running as a regular user, in the R state, actively
>spinning its wheels, be immune to a kill -9? And how can it pass that
>immunity on? This is *weird*.

Because the task is running in the kernel, not in user space.  'R'
means running anywhere, kernel included.  The task has issued a system
call and the kernel code is looping.  All signals, including -9, are
noticed while the task is in kernel space, but they are not acted on
until the code returns from the kernel back to user space.  No return,
no signal checking.
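
A rough way to see this from user space, without attaching to the
task, is to read its /proc entries.  The sketch below is only an
illustration, not part of any tool mentioned in this thread: the
function names are made up, it assumes a Linux /proc that exposes
State and SigPnd in /proc/<pid>/status (ShdPnd exists only on newer
kernels), and it assumes utime and stime are fields 14 and 15 of
/proc/<pid>/stat.  A task looping in a system call would be expected to
show state R, a growing stime with a nearly flat utime, and, after
kill -9, most likely the SIGKILL bit set in a pending mask.

```python
SIGKILL_BIT = 1 << (9 - 1)  # signal N occupies bit N-1 of the hex masks


def parse_status(text):
    """Turn the 'Key: value' lines of /proc/<pid>/status into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def describe(fields):
    """Return (state letter, True if SIGKILL is pending but not delivered)."""
    state = (fields.get("State") or "?").split()[0]
    pending = 0
    # ShdPnd is missing on older kernels; treat a missing mask as empty.
    for key in ("SigPnd", "ShdPnd"):
        pending |= int(fields.get(key, "0"), 16)
    return state, bool(pending & SIGKILL_BIT)


def cpu_ticks(stat_text):
    """Return (utime, stime) in clock ticks from /proc/<pid>/stat."""
    # The command name may contain spaces or ')', so split after the last ')'.
    rest = stat_text.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field 14 is rest[11], field 15 is rest[12].
    return int(rest[11]), int(rest[12])


def report(pid):
    """Read both /proc files for one pid and summarise them (hypothetical helper)."""
    with open(f"/proc/{pid}/status") as f:
        state, kill_pending = describe(parse_status(f.read()))
    with open(f"/proc/{pid}/stat") as f:
        utime, stime = cpu_ticks(f.read())
    return {"state": state, "sigkill_pending": kill_pending,
            "utime": utime, "stime": stime}
```

Calling report() on the stuck pid twice, a few seconds apart, and
comparing the two stime values gives a quick check that the CPU time
is being burned in the kernel rather than in user space.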

>How can you debug an uninterruptible, non-waiting process that hangs
>anything trying to attach to it?

With a kernel debugger.  Ensure that your kernel was compiled with
CONFIG_KDB=y.  Turn kdb on in one of three ways: build with
CONFIG_KDB_OFF=n, boot with "kdb=on", or run
'echo "1" > /proc/sys/kernel/kdb'.  When the task
hangs, invoke kdb with the Pause key (keyboard) or control-A (serial
console).  Use ps to get the process number then 'btp pid' to get a
backtrace on the offending task.

If your disk is still working, before issuing btp, 'set LOGGING=1'.
When you type 'go' to exit kdb, the output will be sent to syslog which
will send it to disk, if it can.  If your disk is not working after
this problem, write the backtrace down by hand or use a serial console
and capture the output there.  Serial consoles are by far the best
option for serious kernel debugging.

