Re: [PATCH v5 07/13] KVM: x86/pmu: Disable counters based on Host-Only/Guest-Only bits in SVM
From: Yosry Ahmed
Date: Tue May 05 2026 - 16:26:14 EST
> > We'd also probably want to rename it. I would honeslty just use 'nested'
> > instead of 'locally_disabled' and 'mode_specific_enables' as that's the
> > only current user.
>
> But it's not strictly nested specific. E.g. even the vCPU doesn't support nested
> virtualization, a (stupid) guest can still set HOST_ONLY and effectively disable
> the counter, thanks to the bizarro behavior of HOST_ONLY when EFER.SVME=0.
Well, that is still 'nested'-related behavior. Anyway, that's
irrelevant, I like your suggestion below.
> > Something like this with your proposed amd_pmc_is_locally_disabled()
> > above, which is similar to the kvm_pmu_ops.mediated_reprogram_counter()
> > implementation in v4 except that the vendor-specific callback is more
> > targeted:
> >
> > static void pmc_check_nested_disabled(struct kvm_pmc *pmc)
> > {
> > if (!(pmc->eventsel & ARCH_PERFMON_EVENTSEL_ENABLE))
> > return;
> >
> > if (!test_bit(pmc->idx, &pmu->pmc_has_nested_enables))
> > return;
> >
> > pmc->is_nested_disabled = kvm_pmu_call(pmc_is_nested_disabled)(pmc);
> > if (!pmc->is_nested_disabled)
> > pmc->eventsel_hw &= ~ARCH_PERFMON_EVENTSEL_ENABLE;
>
> I don't want multiple paths mucking with eventsel_hw's ARCH_PERFMON_EVENTSEL_ENABLE.
> That's how we end up with ordering bugs. E.g. pmc_check_nested_disabled() *must*
> be called after kvm_mediated_pmu_refresh_event_filter(), which is gross and brittle.
>
> E.g. something like so. We can even optimize away the PMU filter lookup (which
> I suspect would be more expensive in the common case?) if the counter is disabled
> thanks to H/G bits.
>
> diff --git arch/x86/kvm/pmu.c arch/x86/kvm/pmu.c
> index 7b2b4ce6bdad..4ca4a0078d35 100644
> --- arch/x86/kvm/pmu.c
> +++ arch/x86/kvm/pmu.c
> @@ -530,21 +530,24 @@ static bool pmc_is_event_allowed(struct kvm_pmc *pmc)
> return is_fixed_event_allowed(filter, pmc->idx);
> }
>
> -static void kvm_mediated_pmu_refresh_event_filter(struct kvm_pmc *pmc)
> +static void kvm_mediated_pmu_refresh_eventsel_hw(struct kvm_pmc *pmc)
> {
> - bool allowed = pmc_is_event_allowed(pmc);
> struct kvm_pmu *pmu = pmc_to_pmu(pmc);
>
> if (pmc_is_gp(pmc)) {
> pmc->eventsel_hw &= ~ARCH_PERFMON_EVENTSEL_ENABLE;
> - if (allowed)
> + if (!test_bit(pmc->idx, &pmu->pmc_has_mode_specific_enables) &&
> + kvm_pmu_call(pmc_is_locally_disabled(pmc)))
> + return;
> +
> + if (pmc_is_event_allowed(pmc))
> pmc->eventsel_hw |= pmc->eventsel &
> ARCH_PERFMON_EVENTSEL_ENABLE;
> } else {
> u64 mask = intel_fixed_bits_by_idx(pmc->idx - KVM_FIXED_PMC_BASE_IDX, 0xf);
>
> pmu->fixed_ctr_ctrl_hw &= ~mask;
> - if (allowed)
> + if (pmc_is_event_allowed(pmc))
> pmu->fixed_ctr_ctrl_hw |= pmu->fixed_ctr_ctrl & mask;
> }
> }
>
> Actually, this code is being silly. It unnecessarily performs the PMU filter
> lookup when the _guest_ disables the counters via eventsel. If you first "fix"
> that by querying pmc_is_locally_enabled() before checking the event filter, then
> you won't even need to touch that code when introducing H/G bits, because it will
> Just Work thanks to pmc_is_locally_enabled() doing the right thing (as it should,
> becase as mentioned early, that logic is architectural).
Believe it or not, I thought about adding this in
kvm_mediated_pmu_refresh_event_filter() from the beginning for the
same reason, to have a single path that updates eventsel_hw. I didn't
pursue it because I started overthinking the new function name and
ended up ditching the whole idea -- which is extremely silly now that
I think about it :)
>
> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
> index 7b2b4ce6bdad..c84f2f971ab1 100644
> --- a/arch/x86/kvm/pmu.c
> +++ b/arch/x86/kvm/pmu.c
> @@ -530,22 +530,23 @@ static bool pmc_is_event_allowed(struct kvm_pmc *pmc)
> return is_fixed_event_allowed(filter, pmc->idx);
> }
>
> -static void kvm_mediated_pmu_refresh_event_filter(struct kvm_pmc *pmc)
> +static void kvm_mediated_pmu_refresh_eventsel_hw(struct kvm_pmc *pmc)
> {
> - bool allowed = pmc_is_event_allowed(pmc);
> + bool allowed = pmc_is_locally_enabled(pmc) && pmc_is_event_allowed(pmc);
> struct kvm_pmu *pmu = pmc_to_pmu(pmc);
>
> if (pmc_is_gp(pmc)) {
> - pmc->eventsel_hw &= ~ARCH_PERFMON_EVENTSEL_ENABLE;
> if (allowed)
> - pmc->eventsel_hw |= pmc->eventsel &
> - ARCH_PERFMON_EVENTSEL_ENABLE;
> + pmc->eventsel_hw |= ARCH_PERFMON_EVENTSEL_ENABLE;
> + else
> + pmc->eventsel_hw &= ~ARCH_PERFMON_EVENTSEL_ENABLE;
> } else {
> u64 mask = intel_fixed_bits_by_idx(pmc->idx - KVM_FIXED_PMC_BASE_IDX, 0xf);
>
> - pmu->fixed_ctr_ctrl_hw &= ~mask;
> if (allowed)
> - pmu->fixed_ctr_ctrl_hw |= pmu->fixed_ctr_ctrl & mask;
> + pmu->fixed_ctr_ctrl_hw |= mask;
> + else
> + pmu->fixed_ctr_ctrl_hw &= ~mask;
> }
> }
>
> @@ -558,7 +559,7 @@ static int reprogram_counter(struct kvm_pmc *pmc)
> u8 fixed_ctr_ctrl;
>
> if (kvm_vcpu_has_mediated_pmu(pmu_to_vcpu(pmu))) {
> - kvm_mediated_pmu_refresh_event_filter(pmc);
> + kvm_mediated_pmu_refresh_eventsel_hw(pmc);
> return 0;
> }
>
> > Also, would you rather I send a new version with everything, or do you
> > want to pick up some of the patches in this version independently?
>
> Uh, probably just send a new version. I doubt I'll get through the first few
> patches before you're ready to send the next version.
Makes sense, thank you!