Re: is killing zombies possible w/o a reboot?

From: Michael Clark
Date: Wed Nov 03 2004 - 20:20:11 EST


On 11/04/04 07:07, Bill Davidsen wrote:
DervishD wrote:

Hi Gene :)

* Gene Heskett <gene.heskett@xxxxxxxxxxx> dixit:

Then the children are reparented to 'init' and 'init' gets rid
of them. That's the way UNIX behaves.


Unforch, I've *never* had it work that way. Any dead process I've ever had while running linux has only been disposable by a reboot.



Well, you know, shit happens... Anyway, could you define 'dead'?
Because if you're talking about zombies whose parent dies, they're
killable easily: just wait until init reaps them (usually in less
than 5 minutes since they dead). If you are talking about zombies who
has their parent alive, then it's a bug in the application, not the
kernel. In fact I wouldn't like if the kernel reaps my children
before I do, just in case I want to do something.

If you're talking about unkillable processes (those stuck in
disk-sleep state), you're right: only rebooting can kill them
(although sometimes they go out of D state and die normally). Bad
luck for you if any dead process you've ever had while running linux
has been of this kind :(


That often seems to be the case, the kernel thinks there's an i/o going on which isn't, and doesn't time it out. It would be nice if there were a way to get the kernel to abort all outstanding i/o on kill -9, but I'm sure if it were easy it would have happened. Timeouts in the application are useful, but in some cases I believe the process dies because it detects a long i/o time but has nothing to do but terminate, which creates the zombie.

It could be any driver code that uses uninterruptible sleeps rather
than interruptible sleeps I believe. If a process is doing a read or
write to one of these devices and it stays stuck in kernel code with
TASK_UNINTERRUPTIBLE and never gets it's expected wake up, then the
signal will never be delivered and the process is stuck indefinately.
The buggy driver code needs to be fixed (either to use interruptible
sleeps and handle the signals or to imlement some sort of timeout).

~mc
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/