Re: [RFC] x86, NMI, Treat unknown NMI as hardware error

From: Huang Ying
Date: Fri May 20 2011 - 04:13:48 EST


Hi, Don,

On 05/18/2011 03:07 AM, Don Zickus wrote:
> On Tue, May 17, 2011 at 11:18:59AM -0700, Andi Kleen wrote:
>>> Random thought, in the Firmware first mode of HEST (which is the only way
>>> GHES records get produced??), does an SCI happen first to jump into the
>>> firmware for processing, then an NMI?
>>
>> Either that or there is a separate service processor which handles it.
>> Presumably it depends a lot on the particular system.
>
> Ah interesting. I was going to suggest somehow setting a bit when an SCI
> comes in and check that bit in the unknown NMI path as a possible hint
> that the NMI might be related to HEST (sorta how we flag unknown NMIs in
> the perf code).
>
> It was just an idea. Obviously a service processor will make that more
> difficult. :-)

Hmm, what's the conclusion? Do you think an unknown NMI should be treated
as a hardware error, at least on some whitelisted machines?

Best Regards,
Huang Ying