Re: WARNING in lmce_supported() during reboot.
From: Benjamin Herrenschmidt
Date: Fri Oct 25 2024 - 22:36:06 EST
On Fri, 2024-10-25 at 16:58 -0700, Dave Hansen wrote:
>
> Hi Folks,
>
> We really do need it to be reproduced on mainline. At the very least,
> it would be greatly appreciated if you could summarize what your fork is
> doing and why you don't think it is responsible.
>
> But I don't see how this could be timing related. That MSR gets locked
> early from what I can tell, long before the system would be rebooting.
>
> Your best bet is going to be getting a handle on what
> MSR_IA32_FEAT_CTL's value was after the CPU was brought up and when this
> reboot was attempted. If those values differ, find out when it got changed.
>
> I'd _suspect_ some kind of BIOS sleep/wakeup wonkiness where something
> forgot to re-lock the MSR.

So far we have only happened to notice it on the serial console while doing
other things. I told Kuniyuki to forward it to you in case it rings a
bell. We can definitely make some more systematic attempts at reproducing
it, but that might take a while.

For me it happened once while rebooting a c5d.4xlarge instance, and not
since (I tried a few reboots), so it could well be something BIOS-related
(CC'ing Alex).

This is a KVM/Nitro guest, so the CPU is somewhat virtualized, but
/proc/cpuinfo says: Intel(R) Xeon(R) Platinum 8124M CPU @ 3.00GHz
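
For the MSR comparison suggested above, here is a rough, untested-on-Nitro
sketch of how the two snapshots could be taken from userspace. It assumes
the msr driver is loaded in the guest (modprobe msr) and that it runs as
root. The helper names (read_msr, decode_feat_ctl, snapshot,
diff_snapshots) are made up for illustration; the MSR index and bit
positions are the ones in the kernel's msr-index.h.

```python
import os
import struct

MSR_IA32_FEAT_CTL = 0x3A          # MSR index, as in msr-index.h
FEAT_CTL_LOCKED = 1 << 0          # lock bit
FEAT_CTL_LMCE_ENABLED = 1 << 20   # LMCE enable bit


def read_msr(cpu, index):
    """Read one 64-bit MSR through /dev/cpu/<cpu>/msr (root, msr driver)."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        # The msr device interprets the file offset as the MSR index.
        data = os.pread(fd, 8, index)
    finally:
        os.close(fd)
    return struct.unpack("<Q", data)[0]


def decode_feat_ctl(value):
    """Pick out the two bits that matter for the LMCE check."""
    return {
        "locked": bool(value & FEAT_CTL_LOCKED),
        "lmce_enabled": bool(value & FEAT_CTL_LMCE_ENABLED),
    }


def snapshot(cpus):
    """Map each CPU number to its raw MSR_IA32_FEAT_CTL value."""
    return {cpu: read_msr(cpu, MSR_IA32_FEAT_CTL) for cpu in cpus}


def diff_snapshots(before, after):
    """Return {cpu: (before, after)} for CPUs whose value changed."""
    return {
        cpu: (before[cpu], after[cpu])
        for cpu in before
        if cpu in after and before[cpu] != after[cpu]
    }
```

The idea would be to call snapshot(range(os.cpu_count())) once right after
boot and once just before the reboot, save both, and feed them to
diff_snapshots(). Any CPU that shows up with "locked" cleared in the second
snapshot would narrow down when the value changed.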
Cheers,
Ben.