Re: [PATCH 1/2] KVM: x86/pmu: Set enable bits for GP counters in PERF_GLOBAL_CTRL at "RESET"

From: Moger, Babu
Date: Tue Apr 16 2024 - 15:19:43 EST

On 3/8/24 19:36, Sean Christopherson wrote:
> Set the enable bits for general purpose counters in IA32_PERF_GLOBAL_CTRL
> when refreshing the PMU to emulate the MSR's architecturally defined
> post-RESET behavior. Per Intel's SDM:
>
> IA32_PERF_GLOBAL_CTRL: Sets bits n-1:0 and clears the upper bits.
>
> and
>
> Where "n" is the number of general-purpose counters available in the processor.
>
> AMD also documents this behavior for PerfMonV2 CPUs in one of AMD's many
> PPRs.
>
> Do not set any PERF_GLOBAL_CTRL bits if there are no general purpose
> counters, although a literal reading of the SDM would require the CPU to
> set either bits 63:0 or 31:0. The intent of the behavior is to globally
> enable all GP counters; honor the intent, if not the letter of the law.
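
For readers following along, the rule described above can be sketched as a small standalone helper. This is only an illustration under stated assumptions, not the code from the patch: the names global_ctrl_reset_value() and global_ctrl_reset_selftest() are hypothetical, and the clamp at 64 counters is there purely to keep the shift well defined.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not KVM's actual code): compute a post-RESET value
 * for PERF_GLOBAL_CTRL that sets bits n-1:0 and clears the upper bits,
 * where n is the number of general purpose counters.
 */
static uint64_t global_ctrl_reset_value(unsigned int nr_gp_counters)
{
	/* No GP counters: leave every bit clear, honoring the intent. */
	if (nr_gp_counters == 0)
		return 0;

	/* Shifting a 64-bit value by 64 or more is undefined in C. */
	if (nr_gp_counters >= 64)
		return ~0ull;

	return (1ull << nr_gp_counters) - 1;
}

/* Small self-check showing the expected values for a few counter counts. */
static void global_ctrl_reset_selftest(void)
{
	assert(global_ctrl_reset_value(0) == 0x0);
	assert(global_ctrl_reset_value(4) == 0xf);
	assert(global_ctrl_reset_value(6) == 0x3f);
}
```

With six GP counters, for example, the helper yields 0x3f, i.e. bits 5:0 set and everything above them clear.
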
>
> Leaving PERF_GLOBAL_CTRL '0' effectively breaks PMU usage in guests that
> haven't been updated to work with PMUs that support PERF_GLOBAL_CTRL.
> This bug was recently exposed when KVM added support for AMD's
> PerfMonV2, i.e. when KVM started exposing a vPMU with PERF_GLOBAL_CTRL to
> guest software that only knew how to program v1 PMUs (that don't support
> PERF_GLOBAL_CTRL).
>
> Failure to emulate the post-RESET behavior results in such guests
> unknowingly leaving all general purpose counters globally disabled (the
> entire reason the post-RESET value sets the GP counter enable bits is to
> maintain backwards compatibility).
>
> The bug has likely gone unnoticed because PERF_GLOBAL_CTRL has been
> supported on Intel CPUs for as long as KVM has existed, i.e. hardly anyone
> is running guest software that isn't aware of PERF_GLOBAL_CTRL on Intel
> PMUs. It also went unnoticed because, up until v6.0, KVM _did_ emulate
> the behavior for Intel CPUs, although the old behavior was likely dumb
> luck: (a) that old code was also broken in its own way (the history of
> this code is a comedy of errors), and (b) PERF_GLOBAL_CTRL was documented
> as having a value of '0' post-RESET in all SDMs before March 2023.
>
> Initial vPMU support in commit f5132b01386b ("KVM: Expose a version 2
> architectural PMU to a guests") *almost* got it right (again likely by
> dumb luck), but for some reason only set the bits if the guest PMU was
> advertised as v1:
>
> 	if (pmu->version == 1) {
> 		pmu->global_ctrl = (1 << pmu->nr_arch_gp_counters) - 1;
> 		return;
> 	}
>
> Commit f19a0c2c2e6a ("KVM: PMU emulation: GLOBAL_CTRL MSR should be
> enabled on reset") then tried to remedy that goof, presumably because
> guest PMUs were leaving PERF_GLOBAL_CTRL '0', i.e. weren't enabling
> counters.
>
> 	pmu->global_ctrl = ((1 << pmu->nr_arch_gp_counters) - 1) |
> 		(((1ull << pmu->nr_arch_fixed_counters) - 1) << X86_PMC_IDX_FIXED);
> 	pmu->global_ctrl_mask = ~pmu->global_ctrl;
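
To make the layout of that older value concrete, here is a minimal standalone sketch of how the GP and fixed counter enable bits combine. It assumes the fixed counter enable bits start at bit 32 (the value X86_PMC_IDX_FIXED had in the kernel); the SKETCH_ and sketch_ names are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Assumption: fixed counter enable bits start at bit 32. */
#define SKETCH_PMC_IDX_FIXED 32

/* Return a value with the low n bits set (n is clamped at 64). */
static uint64_t sketch_low_bits(unsigned int n)
{
	return n >= 64 ? ~0ull : (1ull << n) - 1;
}

/* Enable bits for nr_gp GP counters plus nr_fixed fixed counters. */
static uint64_t sketch_old_global_ctrl(unsigned int nr_gp, unsigned int nr_fixed)
{
	return sketch_low_bits(nr_gp) |
	       (sketch_low_bits(nr_fixed) << SKETCH_PMC_IDX_FIXED);
}

/* Everything that is not an enable bit is treated as reserved. */
static uint64_t sketch_old_global_ctrl_mask(unsigned int nr_gp, unsigned int nr_fixed)
{
	return ~sketch_old_global_ctrl(nr_gp, nr_fixed);
}
```

For a vPMU with 4 GP counters and 3 fixed counters this gives 0x70000000f, with the mask being its bitwise complement.
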
>
> That was KVM's behavior up until commit c49467a45fe0 ("KVM: x86/pmu:
> Don't overwrite the pmu->global_ctrl when refreshing") removed
> *everything*. However, it did so based on the behavior defined by the
> SDM, which at the time stated that "Global Perf Counter Controls" is
> '0' at Power-Up and RESET.
>
> But then the March 2023 SDM (325462-079US) stealthily changed its
> "IA-32 and Intel 64 Processor States Following Power-up, Reset, or INIT"
> table to say:
>
> IA32_PERF_GLOBAL_CTRL: Sets bits n-1:0 and clears the upper bits.
>
> Note, kvm_pmu_refresh() can be invoked multiple times, i.e. it's not a
> "pure" RESET flow. But it can only be called prior to the first KVM_RUN,
> i.e. the guest will only ever observe the final value.
>
> Note #2, KVM has always cleared global_ctrl during refresh (see commit
> f5132b01386b ("KVM: Expose a version 2 architectural PMU to a guests")),
> i.e. there is no danger of breaking existing setups by clobbering a value
> set by userspace.
>
> Reported-by: Babu Moger <babu.moger@xxxxxxx>
> Cc: Sandipan Das <sandipan.das@xxxxxxx>
> Cc: Like Xu <like.xu.linux@xxxxxxxxx>
> Cc: Mingwei Zhang <mizhang@xxxxxxxxxx>
> Cc: Dapeng Mi <dapeng1.mi@xxxxxxxxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>

Tested-by: Babu Moger <babu.moger@xxxxxxx>

--
Thanks
Babu Moger