Re: [PATCH V5 05/10] KVM: x86/pmu: Disable vPMU if the minimum num of counters isn't met
From: Jim Mattson
Date: Tue Apr 11 2023 - 10:58:56 EST
On Tue, Apr 11, 2023 at 6:18 AM Like Xu <like.xu.linux@xxxxxxxxx> wrote:
>
> On 11/4/2023 8:58 pm, Jim Mattson wrote:
> > On Mon, Apr 10, 2023 at 11:17 PM Like Xu <like.xu.linux@xxxxxxxxx> wrote:
> >>
> >> On 11/4/2023 1:36 pm, Jim Mattson wrote:
> >>> On Mon, Apr 10, 2023 at 3:51 AM Like Xu <like.xu.linux@xxxxxxxxx> wrote:
> >>>>
> >>>> From: Like Xu <likexu@xxxxxxxxxxx>
> >>>>
> >>>> Disable PMU support when running on AMD and perf reports fewer than four
> >>>> general purpose counters. All AMD PMUs must define at least four counters
> >>>> due to AMD's legacy architecture hardcoding the number of counters
> >>>> without providing a way to enumerate the number of counters to software,
> >>>> e.g. from AMD's APM:
> >>>>
> >>>> The legacy architecture defines four performance counters (PerfCtrn)
> >>>> and corresponding event-select registers (PerfEvtSeln).
> >>>>
> >>>> Virtualizing fewer than four counters can lead to guest instability as
> >>>> software expects four counters to be available.
> >>>
> >>> I'm confused. Isn't zero less than four?
> >>
> >> As I understand it, you are saying that virtualization of zero counters is
> >> also reasonable. If so, the above statement could be refined as:
> >>
> >> Virtualizing fewer than four counters when vPMU is enabled may lead to guest
> >> instability as software expects at least four counters to be available, thus
> >> the vPMU is disabled if the minimum number of KVM supported counters is not
> >> reached during initialization.
> >>
> >> Jim, does this help you or could you explain more about your confusion ?
> >
> > You say that "fewer than four counters can lead to guest instability
> > as software expects four counters to be available." Your solution is
> > to disable the PMU, which leaves zero counters available. Zero is less
> > than four. Hence, by your claim, disabling the PMU can lead to guest
> > instability. I don't see how this is an improvement over one, two, or
> > three counters.
>
> As you know, AMD pmu lacks an architected method (such as CPUID) to
> indicate that the VM does not have any pmu counters available for the
> current platform. Guests like Linux tend to check if their first counters
> exist and work properly to infer that other pmu counters exist.
"Guests like Linux," or just Linux? What do you mean by "tend"? When
do they perform this check, and when do they not?
> If KVM chooses to emulate more than 1 but fewer than 4 counters, then the
> AMD guest PMU agent may assume that the 4 legacy counters are all
> present (it's what the APM specifies), which requires the legacy code
> to add #GP error handling for counters that should exist but actually do not.
I would argue that regardless of the number of counters emulated, a
guest PMU agent may assume that the 4 legacy counters are present,
since that's what the APM specifies.
> So at Sean's suggestion, we took a conservative approach. If KVM detects
> less than 4 counters, we think KVM (under the current configuration and
> platform) is not capable of emulating the most basic AMD pmu capability.
> A large number of legacy instances are ready for 0 or 4+ ctrs, not 2 or 3
Which specific guest operating systems is this change intended for?
> Does this help you ? I wouldn't mind a better move.
Which AMD platforms have fewer than 4 counters available?
>
> >
> >>>
> >>>> Suggested-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> >>>> Signed-off-by: Like Xu <likexu@xxxxxxxxxxx>
> >>>> ---
> >>>> arch/x86/kvm/pmu.h | 3 +++
> >>>> 1 file changed, 3 insertions(+)
> >>>>
> >>>> diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
> >>>> index dd7c7d4ffe3b..002b527360f4 100644
> >>>> --- a/arch/x86/kvm/pmu.h
> >>>> +++ b/arch/x86/kvm/pmu.h
> >>>> @@ -182,6 +182,9 @@ static inline void kvm_init_pmu_capability(const struct kvm_pmu_ops *pmu_ops)
> >>>> enable_pmu = false;
> >>>> }
> >>>>
> >>>> + if (!is_intel && kvm_pmu_cap.num_counters_gp < AMD64_NUM_COUNTERS)
Does this actually guarantee that the requisite number of counters is
available and will always be available while the guest is running?
What happens if some other client of the host perf subsystem requests
a CPU-pinned counter after this check?
> >>>> + enable_pmu = false;
> >>>> +
> >>>> if (!enable_pmu) {
> >>>> memset(&kvm_pmu_cap, 0, sizeof(kvm_pmu_cap));
> >>>> return;
> >>>> --
> >>>> 2.40.0
> >>>>
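
For reference, the check being discussed can be sketched as a standalone
model. This is not kernel code: struct pmu_cap_model,
model_init_pmu_capability() and MODEL_AMD64_NUM_COUNTERS are hypothetical
stand-ins for kvm_pmu_cap, kvm_init_pmu_capability() and
AMD64_NUM_COUNTERS, and the constant is assumed to be 4, matching the four
legacy counters in the APM text quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for kvm_pmu_cap; only the field the patch tests. */
struct pmu_cap_model {
	int num_counters_gp;
};

/* Assumed value: the four legacy counters the APM describes. */
#define MODEL_AMD64_NUM_COUNTERS 4

/*
 * Model of the init-time check in the patch. On a non-Intel host that
 * reports fewer than four general purpose counters, the vPMU is disabled
 * and the capability struct is zeroed, so the guest sees either zero
 * counters or at least four. Returns whether the vPMU stays enabled.
 */
static bool model_init_pmu_capability(struct pmu_cap_model *cap,
				      bool is_intel, bool enable_pmu)
{
	assert(cap != NULL);

	if (!is_intel && cap->num_counters_gp < MODEL_AMD64_NUM_COUNTERS)
		enable_pmu = false;

	if (!enable_pmu)
		memset(cap, 0, sizeof(*cap));

	return enable_pmu;
}
```

In this model an AMD host reporting three counters ends up exposing zero,
while an Intel host reporting two is left alone. It covers only the
init-time decision and says nothing about whether the counters stay
available while the guest is running.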