Re: [PATCH v2 0/4] perf/x86: Don't write PEBS_ENABLED on KVM transitions
From: Peter Zijlstra
Date: Fri Apr 24 2026 - 08:25:26 EST
On Fri, Apr 24, 2026 at 08:17:42PM +0800, Mi, Dapeng wrote:
>
> On 4/24/2026 12:16 AM, Peter Zijlstra wrote:
> > On Thu, Apr 23, 2026 at 08:03:36AM -0700, Sean Christopherson wrote:
> >> Testing this against our "PEBS_ENABLED is stuck" reproducer is (still) a work
> >> in progress (largely because the "reproducer" is currently "throw the kernel in
> >> a big test pool"), i.e. I don't know if this actually resolves the problems we
> >> are seeing. But even if it doesn't fully resolve our woes, it seems like a
> >> no-brainer improvement, and if we're missing something with respect to "stuck"
> >> PEBS_ENABLED, it'd be nice to get feedback/input asap.
> >>
> >> Note, if the throttling theory is correct (which is looking unlikely at the
> >> moment), then there are likely more fixes that need to be done, e.g. for CPUs
> >> without isolation, and/or if PERF_GLOBAL_CTRL can be modified from NMI context
> >> too.
> > Throttle does: pmu->stop() := x86_pmu_stop() -> intel_pmu_disable_event()
> >
> > Which in turn should:
> >
> >   x86_pmu_disable_event()
> >     wrmsrq(config_base, config & ~EN);
> >   x86_pmu_pebs_disable() := intel_pmu_pebs_disable()
> >     wrmsr(PEBS_ENABLE, pebs_enabled & ~(1<<idx));
> >
> > So that's just the counter EN bit and PEBS_ENABLED cleared. However, if
> > this is from PMI, then the PMI handler should also update GLOBAL_CTRL --
> > provided it wasn't 0.
> >
> > See intel_pmu_handle_irq():
> >
> >   if (pmu_enabled)
> >     __intel_pmu_enable_all()
> >       wrmsrq(GLOBAL_CTRL, intel_ctrl);
> >
> Yes, currently all valid bits in GLOBAL_CTRL would be set by default on
> Intel platforms. IIUC, this issue looks more like a race condition between
> Perf and KVM.
>
> 1. KVM saves the value of host PEBS_ENABLE before VM-entry.
>
> 2. A PMI is triggered and interrupts the upcoming VM-entry. PEBS events are
> throttled and the PEBS_ENABLE MSR is updated in the PMI handler, so the host
> PEBS_ENABLE value that KVM saved becomes stale.
>
> 3. VM-entry continues, and when the next VM-exit occurs, the stale
> PEBS_ENABLE value is restored.
>
> 4. The PEBS_ENABLE MSR keeps the stale value until next write.
>
> It seems an alternative way to fix this issue is to disable the PMU (clearing
> GLOBAL_CTRL) before KVM saves the PMU MSRs?
Yes, that would seem a prudent thing to do.
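
For illustration only, the four-step race above and the suggested ordering can
be modelled in plain userspace C. This is a toy sketch, not kernel code: the
struct, the toy_* helpers, and the perf_shadow field (standing in for perf's
software copy of the enable mask) are all hypothetical, and the MSRs are
ordinary variables. It also assumes that no PMI can arrive while GLOBAL_CTRL
is 0, which ignores any PMI already in flight.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model: plain variables standing in for MSRs (no real MSR access). */
struct toy_pmu {
	uint64_t pebs_enable;  /* stands in for the PEBS_ENABLE MSR */
	uint64_t global_ctrl;  /* stands in for the GLOBAL_CTRL MSR */
	uint64_t perf_shadow;  /* hypothetical: what perf believes PEBS_ENABLE is */
};

/* Hypothetical PMI handler that throttles PEBS counter idx. */
static void toy_pmi_throttle(struct toy_pmu *pmu, unsigned int idx)
{
	assert(idx < 64);
	if (!pmu->global_ctrl)
		return;  /* assumption: PMU disabled, so no overflow and no PMI */
	pmu->perf_shadow &= ~(1ULL << idx);
	pmu->pebs_enable = pmu->perf_shadow;
}

/*
 * Walk through steps 1-4 with a PMI landing between the save and VM-entry.
 * If disable_pmu_first is true, GLOBAL_CTRL is cleared before the save,
 * modelling the alternative ordering suggested in the thread.
 * Returns the PEBS_ENABLE value the host sees after VM-exit.
 */
static uint64_t toy_vm_transition(struct toy_pmu *pmu, unsigned int idx,
				  bool disable_pmu_first)
{
	uint64_t saved_ctrl = pmu->global_ctrl;
	uint64_t saved_pebs;

	if (disable_pmu_first)
		pmu->global_ctrl = 0;

	saved_pebs = pmu->pebs_enable;   /* step 1: save host value */
	toy_pmi_throttle(pmu, idx);      /* step 2: PMI in the window */

	/* step 3: guest runs; VM-exit restores the saved host values */
	pmu->pebs_enable = saved_pebs;
	pmu->global_ctrl = saved_ctrl;

	return pmu->pebs_enable;         /* step 4: value sticks until next write */
}

/* True when the "MSR" disagrees with perf's software view. */
static bool toy_is_stale(const struct toy_pmu *pmu)
{
	return pmu->pebs_enable != pmu->perf_shadow;
}
```

Calling toy_vm_transition() with disable_pmu_first == false leaves the toy
PEBS_ENABLE bit set while perf_shadow has it cleared, which is the "stuck"
symptom; with disable_pmu_first == true the throttle cannot run inside the
window and the two stay in agreement.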