Re: [PATCH] Handle Ice Lake MONITOR erratum

From: Jim Mattson

Date: Thu May 28 2026 - 13:51:58 EST


On Thu, May 28, 2026 at 9:03 AM Andrew Cooper <andrew.cooper3@xxxxxxxxxx> wrote:
>
> On 28/05/2026 4:36 am, Dave Hansen wrote:
> > On 5/27/26 20:06, Jim Mattson wrote:
> >>> The erratum is called ICX143 in the "3rd Gen Intel Xeon Scalable
> >>> Processors, Codename Ice Lake Specification Update". It is Intel
> >>> document 637780, currently available here:
> >>>
> >>> https://cdrdv2.intel.com/v1/dl/getContent/637780
> >> The erratum says, "Due to this erratum, the processor may hang."
> >>
> >> We are seeing some Ice Lake Xeon E5 machines panic due to hard lockups, and
> >> then the kdump kernel dies with "Fatal machine check from unknown source."
> >> Is this behavior consistent with this erratum?
> > That sounds like something different to me. I don't remember machine
> > checks being implicated for this erratum at all.
> >
> > My usual rule of thumb is that machine checks mean bad hardware unless
> > there's a specific and compelling reason otherwise.
>
> I agree.
>
> The symptoms we found were a hang on boot while bringing up APs, and
> there were no machine checks in sight.

Thanks for the confirmation!