Re: [PATCH] x86, mce: disable MCE if cpu has no MCE banks

From: Hidetoshi Seto
Date: Wed Oct 28 2009 - 04:19:39 EST


Andi Kleen wrote:
> Hidetoshi Seto wrote:
>> Without disabling, what can we do on MCE with no bank?
>
> Nothing, but is it really worth adding a special case?

If the question were:
- is it really worth supporting this special environment,
"MCE-capable but no MCE banks"?
then I'd say no.

So I suggested disabling MCE in this uncertain environment.
Otherwise we will end up adding more code for special cases...

>> I found that do_machine_check() does nothing if banks==0 ... it is better
>> to let system to panic with "Machine check from unknown source"?
>
> IMHO yes. In this case the system must be very confused and panic is the
> best you can do. Otherwise it won't do anything interesting anyways.

Agreed, but this is also a special case.
Regardless of the real number of banks, a confused system could fail to
read the value from memory... Hmm, in theory the MCE handler must be
implemented carefully, but I bet the corrupted value will not always be 0,
... is it worth doing?

>>>> Hum, I suppose the line for CPU 0 was slightly different from others,
>>>> because SHD means "this bank is shared bank and controlled by other".
>>>> Maybe:
>>>> CPU 0 MCA banks CMCI:0 CMCI:1 CMCI:2 CMCI:3 CMCI:5 ... CMCI:21
>>>>
>>>> But I agree that we could some work for this messages...
>>>> Is it better to change the message level to debug from info?
>>> Can be made INFO yes, but I would prefer not removing them
>>> from the dmesg for now.
>>>
>>> Perhaps they could be also compressed a bit like SRAT.
>>
>> Like SRAT? I could not catch the meaning ... For example?
>
> See the recent patches from David Rientjes in the same original thread.

I found it, thanks.

So I suppose your idea is something like:
CPU 0 MCA banks CMCI:{0-3,5-9,12-21} POLL:{4,10,11}
CPU 1 MCA banks SHD:{0,1,6-9,12-21} CMCI:{2,3,5} POLL:{4,10,11}
right?
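
To make the compressed form concrete, here is a minimal userspace sketch of how
such a list could be generated from a per-CPU bitmask of banks. The name
format_bank_ranges() and the bitmask argument are hypothetical, invented for
illustration only; nothing here is taken from the patch or from the existing
MCE code. Runs of three or more banks become "first-last", shorter runs are
listed one by one, as in the lines above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch): write the bank numbers whose
 * bit is set in `mask` into `buf` as e.g. "{0-3,5-9,12-21}".
 * Runs of three or more banks are collapsed to "first-last"; a run of
 * two is written as "a,b".
 * Returns the string length, or -1 if `buf` is too small.
 */
static int format_bank_ranges(unsigned long long mask, int nbanks,
                              char *buf, size_t len)
{
    size_t pos = 0;
    const char *sep = "";
    int i, n;

    assert(nbanks >= 0 && nbanks <= 64);
    if (len < 3)                /* room for "{", "}" and the NUL */
        return -1;
    buf[pos++] = '{';

    for (i = 0; i < nbanks; i++) {
        int start, end;

        if (!((mask >> i) & 1))
            continue;

        /* Extend the run while the next bank is also set. */
        start = i;
        while (i + 1 < nbanks && ((mask >> (i + 1)) & 1))
            i++;
        end = i;

        if (end - start >= 2)
            n = snprintf(buf + pos, len - pos, "%s%d-%d", sep, start, end);
        else if (end > start)
            n = snprintf(buf + pos, len - pos, "%s%d,%d", sep, start, end);
        else
            n = snprintf(buf + pos, len - pos, "%s%d", sep, start);

        /* Always keep two bytes in reserve for "}" and the NUL. */
        if (n < 0 || (size_t)n >= len - pos - 1)
            return -1;
        pos += (size_t)n;
        sep = ",";
    }

    buf[pos++] = '}';
    buf[pos] = '\0';
    return (int)pos;
}
```

Calling it once per mode (CMCI, POLL, SHD) and printing only the non-empty
results would give one line per CPU in the style shown above.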

IMHO the format I suggested is easier to read, as long as the number of
banks is not too large.
CPU 0 MCA banks map : CCCC PCCC CCPP CCCC CCCC CC
CPU 1 MCA banks map : ssCC PCss ssPP ssss ssss ss
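
A rough userspace sketch of how such a map line could be built, one letter per
bank with a space after every fourth bank. The enum bank_mode and the function
format_bank_map() are made-up names for illustration, not existing kernel
symbols; the letters follow the example (C = CMCI, P = POLL, s = shared).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-bank state, for illustration only. */
enum bank_mode { BANK_CMCI, BANK_POLL, BANK_SHARED };

/*
 * Hypothetical helper (not from the patch): render one letter per bank,
 * grouped in fours, e.g. "CCCC PCCC CCPP CCCC CCCC CC".
 * Returns the string length, or -1 if `buf` is too small.
 */
static int format_bank_map(const enum bank_mode *modes, int nbanks,
                           char *buf, size_t len)
{
    static const char letter[] = { 'C', 'P', 's' };
    size_t pos = 0;
    size_t need;
    int i;

    assert(modes != NULL && buf != NULL);
    if (nbanks < 0)
        return -1;

    /* One char per bank, one space between groups of four, plus the NUL. */
    need = (size_t)nbanks + (nbanks > 0 ? (size_t)(nbanks - 1) / 4 : 0) + 1;
    if (len < need)
        return -1;

    for (i = 0; i < nbanks; i++) {
        if (i > 0 && i % 4 == 0)
            buf[pos++] = ' ';
        buf[pos++] = letter[modes[i]];
    }
    buf[pos] = '\0';
    return (int)pos;
}
```

The output width grows by one character per bank, so this stays readable only
while the bank count is small, which is the trade-off against the range form.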


Thanks,
H.Seto
