Re: x86/mce/therm_throt incorrect THERM_STATUS_CLEAR_CORE_MASK?
From: srinivas pandruvada
Date: Thu Jun 02 2022 - 12:26:12 EST
On Thu, 2022-06-02 at 18:18 +0200, Arnd Bergmann wrote:
> On Thu, Jun 2, 2022 at 5:52 PM srinivas pandruvada
> <srinivas.pandruvada@xxxxxxxxxxxxxxx> wrote:
> >
> > On Thu, 2022-06-02 at 11:19 +0200, Arnd Bergmann wrote:
> > > I have a Xeon W-2265 (family 6, model 85, stepping 7) that started
> > > constantly spewing messages from the therm_throt driver after one
> > > core overheated:
> > >
> > I think this is a Cascade Lake system. Have you tried the latest
> > microcode?
>
> Thanks for your quick reply. I have now installed the latest microcode,
> 0x5003302 (manually, because the distro still provided version
> 0x5003102).
>
> After that, I tried writing the value 0x2a80 from userspace, and
> that did not cause a trap, so I assume that fixed it.
>
Thanks for reporting.
I am aware of this issue; it should be fixed by the microcode update.
Thanks,
Srinivas
> It's hard to be sure, as the system has only run into the broken
> state twice during its life, and now it's fine. I'll reply here if it
> ever comes back with the new microcode.
>
> Thanks a lot!
>
> Arnd
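
For readers following along, the value 0x2a80 written from userspace above can be broken down into its individual bits. The snippet below is only an illustrative sketch: the helper name set_bits is hypothetical, and this message does not say which register was written or how the write was issued, so the code does nothing beyond the arithmetic.

```python
def set_bits(value: int) -> list[int]:
    """Return the positions of the bits set in value, lowest first."""
    return [i for i in range(value.bit_length()) if (value >> i) & 1]


# Value reported in the message above as written from userspace.
VALUE = 0x2A80

bits = set_bits(VALUE)
print(bits)  # [7, 9, 11, 13]

# Sanity check: OR-ing the individual bits back together gives the value.
rebuilt = 0
for b in bits:
    rebuilt |= 1 << b
print(hex(rebuilt))  # 0x2a80
```

So the write sets bits 7, 9, 11 and 13 and leaves every other bit clear.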