Re: [PATCH v2 4/4] perf/x86: KVM: Have perf define a dedicated struct for getting guest PEBS data

From: Jim Mattson

Date: Mon May 04 2026 - 15:43:25 EST


On Mon, May 4, 2026 at 10:19 AM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> On Fri, May 01, 2026, Jim Mattson wrote:
> > On Thu, Apr 23, 2026 at 8:03 AM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
> > > diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> > > index 7403ca721b6a..04d9c51335d7 100644
> > > --- a/arch/x86/events/intel/core.c
> > > +++ b/arch/x86/events/intel/core.c
> > > @@ -14,7 +14,6 @@
> > > #include <linux/slab.h>
> > > #include <linux/export.h>
> > > #include <linux/nmi.h>
> > > -#include <linux/kvm_host.h>
> > >
> > > #include <asm/cpufeature.h>
> > > #include <asm/debugreg.h>
> > > @@ -4992,11 +4991,11 @@ static int intel_pmu_hw_config(struct perf_event *event)
> > > * when it uses {RD,WR}MSR, which should be handled by the KVM context,
> > > * specifically in the intel_pmu_{get,set}_msr().
> > > */
> > > -static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr, void *data)
> > > +static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr,
> > > + struct x86_guest_pebs *guest_pebs)
> > > {
> > > struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
> > > struct perf_guest_switch_msr *arr = cpuc->guest_switch_msrs;
> > > - struct kvm_pmu *kvm_pmu = (struct kvm_pmu *)data;
> > > u64 intel_ctrl = hybrid(cpuc->pmu, intel_ctrl);
> > > u64 pebs_mask = cpuc->pebs_enabled & x86_pmu.pebs_capable;
> > > u64 guest_pebs_mask = pebs_mask & ~cpuc->intel_ctrl_host_mask;
> > > @@ -5052,7 +5051,7 @@ static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr, void *data)
> > > * wrong counter(s). Similarly, disallow PEBS in the guest if the host
> > > * is using PEBS, to avoid bleeding host state into PEBS records.
> > > */
> > > - guest_pebs_mask &= kvm_pmu->pebs_enable & ~kvm_pmu->host_cross_mapped_mask;
> > > + guest_pebs_mask &= guest_pebs->enable & ~guest_pebs->cross_mapped_mask;
> >
> > It would be helpful to save this mask somewhere, so that it can be
> > used when calculating guest_pebs_idxs in x86_pmu_handle_guest_pebs().
> > I think that code needs a fix similar to the one in commit
> > 58f6217e5d01 ("perf/x86/intel: KVM: Mask PEBS_ENABLE loaded for guest
> > with vCPU's value.").
>
> Blech. This all feels like a losing game of whack-a-mole. Proxying the PMU
> through perf is a mediocre approximation for non-PEBS events, and it seems like
> it's downright awful for PEBS. Ideally, we'd just rip out all of the perf-based
> PEBS virtualization support, and only support PEBS through the mediated PMU. :-/
>
> Absent drastic measures though, saving the effective guest_pebs_enable in the
> per-CPU tracking does seem like the least awful approach. Though I don't quite
> understand why we can't use GLOBAL_STATUS for x86_pmu_handle_guest_pebs(). I.e.
> what happens if x86_pmu_handle_guest_pebs() only processes counters that actually
> got marked as overflowing?

x86_pmu_handle_guest_pebs() is called in the path where we are
handling GLOBAL_STATUS bit 62 (GLOBAL_STATUS_BUFFER_OVF_BIT).
Individual PEBS PMCs are not configured to raise PMI on overflow.

Moreover, I believe kvm_arch_pmi_in_guest() returns a false positive
if perf is using NMI for PMI and a PMI arrives between
kvm_before_interrupt(vcpu, KVM_HANDLING_IRQ) and
kvm_after_interrupt(vcpu).

> Regardless, I'm not going to try and address that mess in this series. AFAICT,
> it's not urgent, and I don't want to snowball into a broader cleanup.

I'll let you know if our instrumentation changes that sense of urgency. :)