Re: I do not know if this is the correct place to ask about this but...

From: Borislav Petkov
Date: Tue Feb 08 2011 - 05:00:54 EST


On Tue, Feb 08, 2011 at 08:31:50PM +1100, dave b wrote:
> I do not know if this is the correct place to ask about this but...
> I have only seen the following output twice and both times have
> been when I was running a 2.6.37 kernel.
>
> [152399.816058] [Hardware Error]: MC4_STATUS: Corrected error, other
> errors lost: no, CPU context corrupt: no, CECC Error
> [152399.816075] [Hardware Error]: Northbridge Error, node 0: , core:
> 1L3 ECC data cache error.
> [152399.816086] [Hardware Error]: Transaction: RD, Type: GEN, Cache
> Level: L3/GEN
> [152399.816092] Disabling lock debugging due to kernel taint
> [152399.816099] [Hardware Error]: Machine check events logged
>
> I assume it is just a coincidence. Also, I am not exactly sure what
> the message "means". (Yes I can read the text - but I haven't found
> good documentation which describes its impact). Note: I submitted a
> bug[0] regarding 'the output' the first time this occurred.

This is an L3 cache correctable error on an AMD F10h machine, I'd guess.

You could install x86info from
http://codemonkey.org.uk/projects/x86info/ and run, as root:

for i in $(seq 0 3); do echo -e "\nCPU$i:"; lsmsr -c $i -a; done > lsmsr.log

[ $(seq 0 3) assumes you have 4 cores, adjust it according to your
machine. Also, you need msr.ko module support, i.e. CONFIG_X86_MSR in
your kernel .config. ]

and send me the lsmsr.log file to check whether there is some more info
about the L3 error.
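
If you would rather not hardcode the core count, the loop above could be
wrapped roughly as follows. This is only a sketch: list_cpus and dump_msrs
are made-up helper names, and it assumes that CPUs show up as cpu<N>
directories under /sys/devices/system/cpu and that lsmsr from x86info is
in your PATH.

```shell
#!/bin/sh
# Sketch only: list_cpus and dump_msrs are hypothetical helper names.

# Print the CPU numbers found under a sysfs-style directory, one per line.
# Entries such as cpufreq or cpuidle are skipped because only names of the
# form cpu<digits> are accepted.
list_cpus() {
    dir=${1:-/sys/devices/system/cpu}
    for d in "$dir"/cpu[0-9]*; do
        [ -d "$d" ] || continue
        n=${d##*/cpu}
        case $n in
            *[!0-9]*) continue ;;
        esac
        printf '%s\n' "$n"
    done | sort -n
}

# Dump the MSRs of every CPU into a log file (run as root).
# Assumption: lsmsr accepts -c <cpu> -a as in the one-liner above.
dump_msrs() {
    log=${1:-lsmsr.log}
    # /dev/cpu/0/msr only exists when msr support is available.
    if [ ! -e /dev/cpu/0/msr ]; then
        echo "no /dev/cpu/0/msr; try 'modprobe msr' or enable CONFIG_X86_MSR" >&2
        return 1
    fi
    for i in $(list_cpus); do
        printf '\nCPU%s:\n' "$i"
        lsmsr -c "$i" -a
    done > "$log"
}
```

After sourcing the file, calling dump_msrs as root should produce the same
lsmsr.log as the one-liner, just without the fixed 0..3 range.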

If you don't have msr.ko support (i.e. CONFIG_X86_MSR is not enabled
in your config), that tool won't help. In that case, I'd suggest
you upgrade your kernel to 2.6.38-rc4, which is stable enough, enable
CONFIG_X86_MSR and catch the error again. Then rerun the small bash
one-liner above.
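
To see how CONFIG_X86_MSR is set on the running kernel, something like the
following might do. Again just a sketch: msr_config_state is a made-up
helper name, and the default path /boot/config-<release> is an assumption
that holds on many distributions but not all of them.

```shell
#!/bin/sh
# Sketch only: msr_config_state is a hypothetical helper name.

# Report how CONFIG_X86_MSR is set in a kernel config file.
# Assumption: the config of the running kernel is in /boot/config-<release>.
msr_config_state() {
    cfg=${1:-/boot/config-$(uname -r)}
    if [ ! -r "$cfg" ]; then
        echo "unknown"
        return 1
    fi
    case $(grep -E '^CONFIG_X86_MSR=' "$cfg" || true) in
        CONFIG_X86_MSR=y) echo "built-in" ;;
        CONFIG_X86_MSR=m) echo "module" ;;
        *) echo "not set" ;;
    esac
}
```

If it reports "module", a "modprobe msr" as root should be enough before
running lsmsr; "not set" means the kernel has to be rebuilt with the
option enabled.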

That should be all for now, feel free to ask questions should anything
be unclear.

Thanks.

--
Regards/Gruss,
Boris.