Re: unknown NMI on AMD Rome

From: Peter Zijlstra
Date: Wed Mar 17 2021 - 06:14:15 EST

On Wed, Mar 17, 2021 at 09:48:29AM +0100, Ingo Molnar wrote:
> >
> So:
> 1215 IBS (Instruction Based Sampling) Counter Valid Value
> May be Incorrect After Exit From Core C6 (CC6) State
> Description
> If a core's IBS feature is enabled and configured to generate an interrupt, including NMI (Non-Maskable
> Interrupt), and the IBS counter overflows during the entry into the Core C6 (CC6) state, the interrupt may be
> issued, but an invalid value of the valid bit may be restored when the core exits CC6.
> Potential Effect on System
> The operating system may receive interrupts due to an IBS counter event, including NMI, and not observe a
> valid IBS register. Console messages indicating "NMI received for unknown reason" have been observed on
> Linux systems.
> Suggested Workaround: None
> Fix Planned: No fix planned

Should be simple enough to disable CC6 while IBS is in use. Kim, can you
please make that happen?
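
For illustration only, here is a rough userspace sketch of the bookkeeping
such a workaround would probably need: block CC6 when the first IBS user
shows up, allow it again when the last one goes away. Everything in it
(struct cc6_guard, ibs_get(), ibs_put(), the cc6_allowed flag) is a
hypothetical name made up for this sketch, not an existing kernel
interface, and the flag merely stands in for whatever mechanism would
actually keep the core out of CC6.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model, not kernel code: counts active IBS users and
 * records whether the core is currently allowed to enter CC6.
 */
struct cc6_guard {
	unsigned int ibs_users;	/* number of active IBS events */
	bool cc6_allowed;	/* true when CC6 entry is permitted */
};

static void cc6_guard_init(struct cc6_guard *g)
{
	g->ibs_users = 0;
	g->cc6_allowed = true;
}

/* Called when an IBS event is enabled; the first user blocks CC6. */
static void ibs_get(struct cc6_guard *g)
{
	if (g->ibs_users++ == 0)
		g->cc6_allowed = false;	/* stand-in for the real CC6 disable */
}

/* Called when an IBS event is released; the last user re-allows CC6. */
static void ibs_put(struct cc6_guard *g)
{
	assert(g->ibs_users > 0);	/* unbalanced put would be a bug */
	if (--g->ibs_users == 0)
		g->cc6_allowed = true;	/* stand-in for the real CC6 enable */
}
```

The counting matters because several IBS events can be active at once;
CC6 must stay off until the last of them is gone. A real implementation
would also need locking or per-CPU state, which this sketch leaves out.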