2.4 lockup issue (flush_tlb_all)

From: Thomas Oulevey
Date: Wed Nov 03 2004 - 06:38:53 EST


Hello,

We are experiencing lockups on our SMP machines.
Here are the details:
- The computers lock up without leaving any relevant log entries.
- The kernel still replies to ping, but higher-level services do not
respond.
- After a few hours (5-8), the kernel responds again, with a load of around
40 that then decreases.

We managed to capture some SysRq showPc output (screenshot:
http://www.elonex.ch/shot/).
According to this basic SysRq debugging, the problem seems to be related
to the function flush_tlb_all, and it is triggered by a write or a read
(local, or sometimes over NFS).

I searched LKML and did not find any known issue.
It may have been fixed upstream but not backported by Red Hat.
I would appreciate any help.

Thank you in advance.

Detailed configuration:
---------------
Processor : 2 x 2.8 GHz Pentium Xeon
Motherboard : Intel SE7501CW2
Memory : 4 x 512 MB DDR 266 ECC registered
Kernel : 2.4.20-31 (Red Hat 7.3 with updates)


Please CC me on any answers or comments.

--
Thomas OULEVEY System Engineer
Elonex Switzerland Email: thomas.oulevey@xxxxxxxxx
Switzerland
