Re: kernel panic at load average of 24 is it acceptable ?

From: Robert Hancock
Date: Mon Jul 17 2006 - 10:29:55 EST


Vikas Kedia wrote:
Here is the output of mcelog:
root@srv1:~# less /var/log/mcelog
MCE 0
CPU 0 0 data cache TSC 6988ae18046
ADDR f87f5ec0
Data cache ECC error (syndrome ce)
bit46 = corrected ecc error
bus error 'local node origin, request didn't time out
data read mem transaction
memory access, level generic'
STATUS 9467400000000833 MCGSTATUS 0
MCE 0
CPU 0 0 data cache TSC 723b38a3633
ADDR 3d9fc0
Data cache ECC error (syndrome ce)
bit46 = corrected ecc error
bit62 = error overflow (multiple errors)
bus error 'local node origin, request didn't time out
data read mem transaction
memory access, level generic'
STATUS d467400000000833 MCGSTATUS 0

Since it shows ECC error is the hypothesis correct that its the RAM
problem and replacing it should solve the problem.

That looks like a data cache ECC error, which would point to a CPU problem in that case..

--
Robert Hancock Saskatoon, SK, Canada
To email, remove "nospam" from hancockr@xxxxxxxxxxxxx
Home Page: http://www.roberthancock.com/

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/