Re: [PATCH] perf/x86/intel: Don't update PEBS_ENABLE MSR in NMI context

From: Jim Mattson

Date: Thu Apr 09 2026 - 22:59:11 EST


On Thu, Apr 9, 2026 at 7:24 PM Mi, Dapeng <dapeng1.mi@xxxxxxxxxxxxxxx> wrote:
>
>
> On 3/21/2026 2:58 AM, Jim Mattson wrote:
> > On Fri, Mar 20, 2026 at 8:29 AM Jim Mattson <jmattson@xxxxxxxxxx> wrote:
> >> diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
> >> index 5027afc97b65..852baf6a05e9 100644
> >> --- a/arch/x86/events/intel/ds.c
> >> +++ b/arch/x86/events/intel/ds.c
> >> @@ -1982,8 +1982,10 @@ void intel_pmu_pebs_enable_all(void)
> >> {
> >> struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
> >>
> >> - if (cpuc->pebs_enabled)
> >> + if (cpuc->pebs_enabled || cpuc->pebs_stale)
> >> wrmsrq(MSR_IA32_PEBS_ENABLE, cpuc->pebs_enabled);
> >> +
> >> + cpuc->pebs_stale = false;
>
> Sashiko comments
>
> "
>
> Can this race with an NMI?
> If a PMI fires exactly between wrmsrq() and cpuc->pebs_stale = false:
> CPU0
> intel_pmu_pebs_enable_all()
> wrmsrq(MSR_IA32_PEBS_ENABLE, cpuc->pebs_enabled);
>
> <--- NMI (PMI) fires here
> handle_pmi_common()
> ... throttles event ...
> cpuc->pebs_stale = true;
> <--- NMI returns
>
> cpuc->pebs_stale = false;
> Would this clobber the pebs_stale flag set by the NMI, leaving the software
> state desynchronized from the hardware MSR and potentially causing the PMU
> to skip the MSR update on the next enable?
>
> "
>
> It looks reasonable.
> https://sashiko.dev/#/patchset/20260320152928.1916447-1-jmattson%40google.com
>
> Thanks.

I believe this function is only called while the PMU is disabled (on
the path to re-enabling it), so there should be no PMIs.

Sadly, though, this patch has not fixed our problem.