Re: [Suspend-devel] Fwd: Kernel Oops with 2.6.23

From: Ritesh Raj Sarraf
Date: Mon Dec 17 2007 - 11:18:32 EST


On Tuesday 04 December 2007, Pavel Machek wrote:
> On Mon 2007-12-03 06:01:26, Ritesh Raj Sarraf wrote:
> > On Sunday 02 December 2007, Pavel Machek wrote:
> > > killall -9 pulseaudio. If pulseaudio is not dead within 60 seconds,
> > > you hit a kernel bug. If it needs suspend to be reproduced, you
> > > probably have a suspend bug.
> >
> > Hi Pavel,
> >
> > Something similar to this are multiple cases where the kernel is not able
> > to kill a process at all.
> >
> > A good example is an application pumping IO to a multipathed device. When
> > all the paths to the multipathed devices go down, and you'd like to kill
> > the process, there is no way left to do it. In fact, a reboot also
> > doesn't work in such cases. Reboot gets hung in midway trying to kill the
> > process. The user is left to do a hard reset of the machine.
> >
> > In situations like these, the processes go into D state.
> >
> > Here's what the manpage of ps says:
> >
> > PROCESS STATE CODES
> > Here are the different values that the s, stat and state output
> > specifiers (header "STAT" or "S")
> > will display to describe the state of a process.
> > D Uninterruptible sleep (usually IO)
> >
> > Does it mean that processes in D state are excluded by the kernel from
> > being killed ? Or is it still a kernel bug ?
>
> Still a kernel bug. Processes should not stay in D state for long.
> Pavel

Hi Pavel,

Sometime back we discussed about 'D' state processes which are not killed by
the kernel by any signal.

Here's a bugzilla detailing the symptom.
https://bugzilla.redhat.com/show_bug.cgi?id=419581
[I/O Processes don't get killed when all the paths to the LUN are down]

It is still being assumed as working as designed.

Ritesh
--
Ritesh Raj Sarraf
RESEARCHUT - http://www.researchut.com
"Necessity is the mother of invention."

Attachment: signature.asc
Description: This is a digitally signed message part.