2.4.18-24 SMP Machine stuck in zombie state after kernel Oops

From: yuval yeret (yuval_yeret@hotmail.com)
Date: Wed Jul 16 2003 - 02:54:42 EST


Hi,

Tried to find information about a kernel OOPS I've seen a lot of times
already on 4 different machines - but nothing seems to be said about this in
the list archives or anywhere else for that matter.

We are running 2.4.18-24 on SMP machines with 2CPUs and hyperthreading
(SuperMicro Xeon servers) and doing heavy IO to disk and networking. (Qlogic
HBAs and Intel e1000 NICs are used)

At some point the machine oopses (no scenario except heavy nfs-server like
load):

The oops doesn't appear in logs or on the console, but I've been able to use
the diagnostic keys to get the following information:

right ALT+Scroll lock

CPU 0 : swapper

[<c0106f32>] (0xc0323fc4)) default_idle
[<c0105000>] (0xc0323fd4)) empty_zero_page

CPU 1 : swapper

(<c0106f32>) (0xc 6597fb0) - default_idle
(<c011d29b>) (0xc 6597fd0) - out_of_line_bug
(<c011d449>) (8xc 659ffc) - printk

CPU 3 : swapper

[<c0106f32>] (0xc8257fb0)) default_idle
[<c011d449>][0xc8257fd0)) printk

CTRL +Scroll lock

XINETD
[<c0108dc0>]sys_sigaltstack
[<c01151b1>]flush_tlb_page
[<c0115140>]flush_tlb_page
[<c014086e>]shmem_file_setup
[<c0131109>]do_generic_file_read
[<c013121e>]generic_file_read
[<c01310b0>]do_generic_file_read
[<c0142aa6>]sys_read
[<c0108ccf>]sys_sigaltstack

CROND

[<c01198f1>]wait_for_completion
[<c011c47a>]do_fork
[<c0107718>]dump_thread
[<c0108ccf>]sys_sigaltstack

After the oops networking stack continues to function, some running daemons
continue to work (I'm seeing network traffic from the machine which
indicates that clearly), but login into the node is not possible via
console, ssh, rsh, and the majority of the application processes are dead.

Any information / pointers will be appreciated.

If any information is missing or anything I should do to help analyze next
time it happens tell me as well.

Thanks,

--
Yuval Yeret
Exanet
yuval@exanet.com
http://www.exanet.com
Tel.  972-9-9717782
Fax. 972-9-9717778

_________________________________________________________________ The new MSN 8: smart spam protection and 2 months FREE* http://join.msn.com/?page=features/junkmail

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Wed Jul 23 2003 - 22:00:23 EST