Re: ECC error in 2.5.64 + some patches

From: Dave Jones (davej@codemonkey.org.uk)
Date: Thu Mar 27 2003 - 19:10:15 EST


On Thu, Mar 27, 2003 at 08:31:20AM -0800, Larry McVoy wrote:
> > | Message from syslogd@slovax at Thu Mar 27 05:53:49 2003 ...
> > | slovax kernel: Bank 1: 9000000000000151
> > You can try the Dave Jones "parsemce" tool on it, from
> > http://www.codemonkey.org.uk/cruft/parsemce.c/
>
> slovax /tmp a.out -b 1 -e 9000000000000151
> Status: (-8070450532247928495) Restart IP valid.
>
> What does that mean?

It means Dave sucks and hasn't done a good enough job on the parser.
parsemce is really really unintuitive to use.

There are some bits missing from your dump. Usually, MCEs look like this:

 Sep 4 21:43:41 hamlet kernel: CPU 0: Machine Check Exception: 0000000000000004
 Sep 4 21:43:41 hamlet kernel: Bank 1: f600200000000152 at 7600200000000152

All we have to go on in your example is the bank status code
(which is -s, not -e; -e would be the 0000000000000004 in the example above). [*]
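
The odd-looking output in the quoted run can be reproduced with a few lines
of Python. This is only a sketch, not parsemce's code: it assumes the -e value
is the machine check global status (MCG_STATUS) with the usual layout (bit 0
restart IP valid, bit 1 error IP valid, bit 2 machine check in progress), and
the helper names are made up for illustration.

```python
# Hypothetical helpers, not taken from parsemce.c.
# Assumed MCG_STATUS layout: bit 0 RIPV, bit 1 EIPV, bit 2 MCIP.
MCG_STATUS_BITS = {
    0: "Restart IP valid",
    1: "Error IP valid",
    2: "Machine check in progress",
}

def decode_mcg_status(value):
    """Return the names of the low MCG_STATUS flags that are set in value."""
    return [name for bit, name in MCG_STATUS_BITS.items() if (value >> bit) & 1]

def as_signed64(value):
    """Reinterpret an unsigned 64-bit value as signed two's complement."""
    return value - (1 << 64) if value & (1 << 63) else value

bank_status = 0x9000000000000151   # the value that was passed to -e by mistake

print(as_signed64(bank_status))         # -8070450532247928495
print(decode_mcg_status(bank_status))   # ['Restart IP valid']
print(decode_mcg_status(0x4))           # ['Machine check in progress']
```

Read that way, the negative number is just the bank status printed as a signed
64-bit integer, and "Restart IP valid" appears only because bit 0 of
9000000000000151 happens to be set.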

So, without the missing bits, we have to fake it..

(davej@deviant:davej)$ ./a.out -b 1 -e 1 -s 9000000000000151 -a 0
Status: (1) Restart IP valid.
parsebank(1): 9000000000000151 @ 0
        External tag parity error
        Error enabled in control register
        Memory heirarchy error
        Request: Generic error
        Transaction type : Instruction
        Memory/IO : Reserved

Ignore the Status: line, that's decoded from the (faked) -e 1.
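
For anyone who wants to sanity-check a bank status by hand, here is a minimal
sketch of splitting it into fields. It assumes the architectural MCi_STATUS
layout (flag bits 63..57, model-specific code in bits 31..16, MCA error code
in bits 15..0). It does not try to reproduce parsemce's decoding of the error
code itself, and the function and table names are hypothetical.

```python
# Hypothetical sketch; bit positions assume the architectural MCi_STATUS layout.
MCI_STATUS_FLAGS = [
    (63, "VAL"),    # register contains a valid error
    (62, "OVER"),   # an earlier error was overwritten
    (61, "UC"),     # error was not corrected by hardware
    (60, "EN"),     # error reporting enabled in the control register
    (59, "MISCV"),  # MISC register holds extra information
    (58, "ADDRV"),  # ADDR register holds the error address
    (57, "PCC"),    # processor context may be corrupt
]

def decode_bank_status(status):
    """Split a 64-bit bank status into flag names and the two code fields."""
    return {
        "flags": [name for bit, name in MCI_STATUS_FLAGS if (status >> bit) & 1],
        "model_specific_code": (status >> 16) & 0xFFFF,
        "mca_error_code": status & 0xFFFF,
    }

result = decode_bank_status(0x9000000000000151)
print(result["flags"])                # ['VAL', 'EN']
print(hex(result["mca_error_code"]))  # 0x151
```

Under that assumed layout, EN lines up with the "Error enabled in control
register" line above, and UC being clear would mean the hardware reported the
error as corrected.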

Any the wiser ? 8-) [*]

                Dave

[*] See, unintuitive, evil and nasty.
    Given the time, I'd start over from scratch.
