Re: bluesmoke, machine check exception, reboot

From: Alan Cox (alan@lxorguk.ukuu.org.uk)
Date: Tue May 28 2002 - 08:50:37 EST


On Tue, 2002-05-28 at 13:19, Corin Hartland-Swann wrote:
> I have a Dual PIII-1000 running 2.4.18, and am occasionally getting the
> following error:
>
> > CPU 1: Machine Check Exception: 000000000000000004
> > Bank 1: f200000000000115
> > Kernel panic: CPU context corrupt
>
> This results in a hard lock (unable to use magic SysRQ key to sync or
> reboot, etc). I located these errors in arch/i386/kernel/bluesmoke.c in
> the function intel_machine_check(). From what I have read on lkml it is
> probably a result of the processor overheating and causing errors.

It may even be a faulty processor. If you are running the processor to
spec and your heatsink/fan/voltage all check out, you may want to see
about getting the CPU replaced. That appears to be an L1 data cache
read error.
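
For reference, here is a rough sketch of how that reading can be pulled
out of the Bank 1 value. It is Python purely for illustration, and it
assumes the bit layout Intel documents for the machine check bank status
registers (flags in bits 63-57, MCA error code in bits 15-0, cache errors
encoded as 0001 RRRR TTLL). The function and table names are made up for
this example and are not taken from bluesmoke.c.

```python
# Flag bits at the top of the 64-bit bank status value
# (assumed Intel MCA layout, not copied from the kernel source).
STATUS_FLAGS = {
    63: "VAL",    # register holds a valid error
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # error was not corrected by hardware
    60: "EN",     # reporting was enabled for this error
    59: "MISCV",  # MISC register holds extra information
    58: "ADDRV",  # ADDR register holds a valid address
    57: "PCC",    # processor context corrupt
}

# Sub-fields of a cache hierarchy error code: 0001 RRRR TTLL.
REQUEST = {0: "ERR", 1: "RD", 2: "WR", 3: "DRD", 4: "DWR",
           5: "IRD", 6: "PREFETCH", 7: "EVICT", 8: "SNOOP"}
KIND = {0: "I", 1: "D", 2: "G"}             # instruction, data, generic
LEVEL = {0: "L0", 1: "L1", 2: "L2", 3: "LG"}


def decode_bank_status(status):
    """Split a bank status value into flag names and a cache error, if any."""
    flags = [name
             for bit, name in sorted(STATUS_FLAGS.items(), reverse=True)
             if (status >> bit) & 1]
    code = status & 0xFFFF                   # architectural MCA error code
    cache = None
    # Bits 15-8 must read 0000 0001 for a cache hierarchy error; bit 12 is
    # masked out because later processors reuse it for another purpose.
    if (code & 0xEF00) == 0x0100:
        cache = {
            "level": LEVEL[code & 0x3],
            "kind": KIND.get((code >> 2) & 0x3, "?"),
            "request": REQUEST.get((code >> 4) & 0xF, "?"),
        }
    return {"flags": flags, "mca_code": code, "cache": cache}


if __name__ == "__main__":
    print(decode_bank_status(0xF200000000000115))
```

For f200000000000115 this gives the flags VAL, OVER, UC, EN and PCC, and
a cache error with level L1, kind D (data) and request RD (read). The PCC
flag would be consistent with the "CPU context corrupt" panic.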

> meantime is there anything I can do to get the machine to reboot after the
> panic? After the last time that this happened, I set
> /proc/sys/kernel/panic to 10, but it hasn't happened since then so I can't
> tell whether it will work. The error listed above is the entire error
> before the machine fails - there is no register dump or anything after
> that.

That /proc setting should cause a reboot, although after an MCE the
state of the machine is a little undefined.
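
As a small sketch of checking that setting before the next panic, the
Python below reads and writes /proc/sys/kernel/panic, which holds the
number of seconds the kernel waits after a panic before rebooting (0
means it does not reboot). The helper names are invented for this
example, and writing the real file needs root.

```python
PANIC_SYSCTL = "/proc/sys/kernel/panic"


def read_panic_timeout(path=PANIC_SYSCTL):
    """Return the configured panic timeout in seconds."""
    with open(path) as f:
        return int(f.read().strip())


def set_panic_timeout(seconds, path=PANIC_SYSCTL):
    """Write a new timeout, then read it back to confirm what was stored."""
    with open(path, "w") as f:
        f.write("%d\n" % int(seconds))
    return read_panic_timeout(path)
```

Calling set_panic_timeout(10) as root should correspond to the setting
described above. It only confirms the value is stored, not that the
machine will actually come back after an MCE.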

> Do you think it will manage to reboot with a hopelessly confused
> processor?

It should do.
