Re: [PATCH v4 4/6] KVM: x86/pmu: Re-evaluate Host-Only/Guest-Only on nested SVM transitions

From: Jim Mattson

Date: Thu Apr 09 2026 - 00:59:34 EST


On Wed, Mar 25, 2026 at 8:12 PM Yosry Ahmed <yosry@xxxxxxxxxx> wrote:
>
> Reprogram all counters on nested transitions for the mediated PMU, to
> re-evaluate Host-Only and Guest-Only bits and enable/disable the PMU
> counters accordingly. For example, if Host-Only is set and Guest-Only is
> cleared, a counter should be disabled when entering guest mode and
> enabled when exiting guest mode.
>
> Having one of Host-Only and Guest-Only set is only effective when
> EFER.SVME is set, so also trigger counter reprogramming when EFER.SVME
> is toggled.
>
> Track counters with one of Host-Only and Guest-Only set as counters
> requiring reprogramming on nested transitions in a bitmap. Use the
> bitmap to only request KVM_REQ_PMU if some counters need reprogramming,
> and only reprogram the counters that actually need it.
>
> Track such counters even if EFER.SVME is cleared, such that if/when
> EFER.SVME is set, KVM can reprogram those counters and enable/disable
> them appropriately. Otherwise, toggling EFER.SVME would need to
> reprogram all counters and use a different code path than
> kvm_pmu_handle_nested_transition().
>
> Signed-off-by: Yosry Ahmed <yosry@xxxxxxxxxx>
> ---
> ...
> diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
> index bdbe0456049d0..fb73806d3bfa0 100644
> --- a/arch/x86/kvm/pmu.h
> +++ b/arch/x86/kvm/pmu.h
> @@ -248,6 +248,19 @@ static inline bool kvm_pmu_is_fastpath_emulation_allowed(struct kvm_vcpu *vcpu)
> X86_PMC_IDX_MAX);
> }
>
> +static inline void kvm_pmu_handle_nested_transition(struct kvm_vcpu *vcpu)
> +{
> +	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
> +
> +	if (bitmap_empty(pmu->pmc_needs_nested_reprogram, X86_PMC_IDX_MAX))
> +		return;
> +
> +	BUILD_BUG_ON(sizeof(pmu->pmc_needs_nested_reprogram) != sizeof(atomic64_t));
> +	atomic64_or(*(s64 *)pmu->pmc_needs_nested_reprogram,
> +		    &vcpu_to_pmu(vcpu)->__reprogram_pmi);
> +	kvm_make_request(KVM_REQ_PMU, vcpu);
> +}

In general, deferring the re-evaluation to KVM_REQ_PMU processing is
misguided. The Guest-Only/Host-Only bits should be re-evaluated before
we call kvm_pmu_instruction_retired() for an emulated instruction.
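
To illustrate the concern, here is a toy user-space model, not KVM
code. Every name in it (toy_vcpu, TOY_HOST_ONLY, toy_run(), and so on)
is hypothetical, and the bit positions are arbitrary. It assumes a
counter with Host-Only set and EFER.SVME set, and it assumes an emulated
instruction can be retired after the switch to guest mode but before the
pending request is processed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit positions, chosen only for this sketch. */
#define TOY_GUEST_ONLY	(1u << 0)
#define TOY_HOST_ONLY	(1u << 1)

struct toy_vcpu {
	bool svme;			/* models EFER.SVME */
	bool guest_mode;		/* models is_guest_mode() */
	bool reprogram_pending;		/* models a deferred PMU request */
	unsigned int eventsel;		/* Guest-Only/Host-Only bits only */
	bool counter_enabled;		/* result of the last evaluation */
	uint64_t count;
};

/* Simplified rule: exactly one bit set filters by mode when SVME is set. */
static bool toy_evaluate(const struct toy_vcpu *v)
{
	bool g = v->eventsel & TOY_GUEST_ONLY;
	bool h = v->eventsel & TOY_HOST_ONLY;

	if (!v->svme || g == h)
		return true;
	return g ? v->guest_mode : !v->guest_mode;
}

/* Deferred variant: only record that reprogramming is needed. */
static void toy_enter_guest_deferred(struct toy_vcpu *v)
{
	v->guest_mode = true;
	v->reprogram_pending = true;
}

/* Immediate variant: re-evaluate as part of the transition. */
static void toy_enter_guest_immediate(struct toy_vcpu *v)
{
	v->guest_mode = true;
	v->counter_enabled = toy_evaluate(v);
}

static void toy_process_requests(struct toy_vcpu *v)
{
	if (v->reprogram_pending) {
		v->counter_enabled = toy_evaluate(v);
		v->reprogram_pending = false;
	}
}

/* Stands in for counting one emulated instruction. */
static void toy_instruction_retired(struct toy_vcpu *v)
{
	if (v->counter_enabled)
		v->count++;
}

/* Returns how many guest instructions the Host-Only counter counted. */
static uint64_t toy_run(bool deferred)
{
	struct toy_vcpu v = { .svme = true, .eventsel = TOY_HOST_ONLY };

	v.counter_enabled = toy_evaluate(&v);
	assert(v.counter_enabled);	/* enabled while in host mode */

	if (deferred)
		toy_enter_guest_deferred(&v);
	else
		toy_enter_guest_immediate(&v);

	toy_instruction_retired(&v);	/* emulated guest instruction */
	toy_process_requests(&v);	/* too late for the deferred case */
	return v.count;
}
```

In this model, toy_run(true) returns 1 because the stale enable state
lets the Host-Only counter count a guest instruction, while
toy_run(false) returns 0.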

> ...
> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
> index f1c29ac306917..966e4138308f6 100644
> --- a/arch/x86/kvm/x86.h
> +++ b/arch/x86/kvm/x86.h
> @@ -9,6 +9,7 @@
> #include "kvm_cache_regs.h"
> #include "kvm_emulate.h"
> #include "cpuid.h"
> +#include "pmu.h"
>
> #define KVM_MAX_MCE_BANKS 32
>
> @@ -152,6 +153,8 @@ static inline void enter_guest_mode(struct kvm_vcpu *vcpu)
>  {
>  	vcpu->arch.hflags |= HF_GUEST_MASK;
>  	vcpu->stat.guest_mode = 1;
> +
> +	kvm_pmu_handle_nested_transition(vcpu);
>  }

This happens too late for VMRUN, since we have already called
kvm_pmu_instruction_retired() via kvm_skip_emulated_instruction(), and
VMRUN counts as a *guest* instruction.
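
The ordering problem can be sketched with another toy user-space model.
Again, this is not KVM code and all names are hypothetical. It assumes a
single Guest-Only counter with EFER.SVME set, and it only contrasts the
two orderings. It is not a proposed fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct toy_state {
	bool guest_mode;		/* models is_guest_mode() */
	uint64_t guest_only_count;	/* hypothetical Guest-Only counter */
};

/* Stands in for retiring one emulated instruction. */
static void toy_retire(struct toy_state *s)
{
	if (s->guest_mode)
		s->guest_only_count++;
}

/* Stands in for the host-to-guest mode switch. */
static void toy_enter_guest_mode(struct toy_state *s)
{
	s->guest_mode = true;
}

/* Ordering described above: VMRUN is retired, then the mode flips. */
static uint64_t toy_vmrun_retire_first(void)
{
	struct toy_state s = { 0 };

	assert(!s.guest_mode);
	toy_retire(&s);			/* evaluated in host mode */
	toy_enter_guest_mode(&s);
	return s.guest_only_count;
}

/* Alternative ordering: the mode flips before VMRUN is retired. */
static uint64_t toy_vmrun_transition_first(void)
{
	struct toy_state s = { 0 };

	assert(!s.guest_mode);
	toy_enter_guest_mode(&s);
	toy_retire(&s);			/* evaluated in guest mode */
	return s.guest_only_count;
}
```

In this model, toy_vmrun_retire_first() returns 0, so the Guest-Only
counter misses VMRUN, while toy_vmrun_transition_first() returns 1.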