Re: [PATCH] Handle Ice Lake MONITOR erratum
From: Andrew Cooper
Date: Thu May 28 2026 - 12:55:53 EST
On 28/05/2026 4:36 am, Dave Hansen wrote:
> On 5/27/26 20:06, Jim Mattson wrote:
>>> The erratum is called ICX143 in the "3rd Gen Intel Xeon Scalable
>>> Processors, Codename Ice Lake Specification Update". It is Intel
>>> document 637780, currently available here:
>>>
>>> https://cdrdv2.intel.com/v1/dl/getContent/637780
>> The erratum says, "Due to this erratum, the processor may hang."
>>
>> We are seeing some Ice Lake Xeon E5 machines panic due to hard lockups, and
>> then the kdump kernel dies with "Fatal machine check from unknown source."
>> Is this behavior consistent with this erratum?
> That sounds like something different to me. I don't remember machine
> checks being implicated for this erratum at all.
>
> My usual rule of thumb is that machine checks mean bad hardware unless
> there's a specific and compelling reason otherwise.
I agree.

The symptoms we found were a hang on boot while bringing up APs, and
there were no machine checks in sight.

~Andrew