Re: [BUG] machine check Oops on Alpha

From: Maciej W. Rozycki
Date: Tue Apr 19 2016 - 20:46:26 EST


On Tue, 19 Apr 2016, Bob Tracy wrote:

> > 4.6.0-rc4 build complete, including suggested (by Alan Young) "Verbose
> > Machine Checks" option set to level 2 by default. System rebooted, and
> > now we wait... Thanks for everyone's continued patience.
>
> Within three minutes of rebooting, I got a machine check, but perhaps
> significantly, no "Oops". I'm guessing the only reason I'm seeing the
> ECC errors now (haven't seen them before) is because of the stepped-up
> debug output. Syslog output attached...

If this is a code generation bug, which I now suspect even more strongly
than before, then the debug verbosity configuration change may well have
made the compiler behave. As you can see from the log, the logout area
pointer is not null:

machine check: LA: fffffc0000006000

(of course merely inserting this `printk' call may have masked the bug,
regardless of the debug verbosity change). Consequently further
information is printed -- the:

CIA machine check: vector=0x630 pc=0xfffffc00005b66ac code=0x86

line would have been printed anyway -- in fact the Oops previously
happened in an attempt to retrieve `code' to print with this line.
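
To illustrate the failure mode described above, here is a minimal
user-space sketch in C. It is not the kernel's actual code: the
`struct logout_frame' layout and the names `logout_code' and
`format_mcheck' are hypothetical, made up for this example only. It just
shows why fetching `code' through a null logout area pointer would fault
before the line gets printed, and how a guarded read avoids that.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for a machine check logout frame.
 * The real kernel structure is laid out differently. */
struct logout_frame {
	uint32_t size;
	uint32_t code;
};

/* Fetch the machine check code from the logout frame.
 * Returns 0 and stores the code on success, or -1 if the logout area
 * pointer is null, in which case nothing is dereferenced. */
int logout_code(const struct logout_frame *la, uint32_t *code)
{
	if (la == NULL)
		return -1;
	*code = la->code;
	return 0;
}

/* Format a report line into buf.  An unguarded la->code here is the
 * kind of access that would fault with a null logout area pointer.
 * Returns the value returned by snprintf. */
int format_mcheck(char *buf, size_t len, uint64_t vector, uint64_t pc,
		  const struct logout_frame *la)
{
	uint32_t code;

	if (logout_code(la, &code) < 0)
		return snprintf(buf, len,
				"CIA machine check: vector=0x%" PRIx64
				" pc=0x%" PRIx64 " (no logout frame)",
				vector, pc);

	return snprintf(buf, len,
			"CIA machine check: vector=0x%" PRIx64
			" pc=0x%" PRIx64 " code=0x%" PRIx32,
			vector, pc, code);
}
```

Calling `format_mcheck' with the vector and pc values from the log and a
frame whose `code' field is 0x86 produces a line of the same shape as the
one quoted above; calling it with a null frame yields the fallback text
instead of a fault.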

I can see if I can find anything suspicious there if you send me the
original copies (i.e. those that oopsed) of arch/alpha/kernel/irq_alpha.o
and arch/alpha/kernel/core_cia.o.

> Machine has been stable since the machine check. Kernel is 4.6.0-rc4.

Yeah, it was a correctable error after all.

Maciej