redhat 2.4.18-14 kernel weird behaviour

From: Frank Dekervel (kervel-lkml@drie.kotnet.org)
Date: Sat Jun 14 2003 - 10:26:23 EST


hello,

i see the following on a redhat8 2.4.18-14 machine: a 'snmpdm' process (part
of ciagent) gets stuck, using 100% cpu (100% 'system' usage in top).
after that, the machine becomes unusable: impossible to ssh to the machine,
daemons like mysql and oracle become non-functional , ...
thats the first weird part, but there is more:
in one case a hard kill (killall -9 snmpdm) did not have immediate effect: the
process only got killed several hours after the hard kill (i don't know how
long, when the kill failed my session froze, and i gave up. next morning
everything was normal again, and snmpdm was killed). in another case the hard
kill worked, and after killing snmpdm the system became normal again
immediately (frozen ssh sessions came to life again, ...)

i managed to kill the process via a ssh session that did not freeze for some
reason. when the system returned to normal, i could see that the system load
was very high, so the 'frozen' processes were probably just stuck in D state
(the only process taking CPU was snmpdm). disk starvation or something ?

Did someone experience the same problems (using rh8 kernel) ? i searched the
redhat bugzilla database and i checked the kernel errata, but i could not
find anything relevant... And someone has a clue on how to find the cause of
this problem ? (its not easy to reproduce, altough it happent 3 times in 3
weeks time or so)

thanks in advance,
greetings,
frank
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sun Jun 15 2003 - 22:00:40 EST