Re: [PATCH v4 11/12] KVM: x86/svm/pmu: Add AMD PerfMonV2 support

From: Sean Christopherson
Date: Fri Apr 07 2023 - 10:45:04 EST


On Fri, Apr 07, 2023, Like Xu wrote:
> On 7/4/2023 9:35 am, Sean Christopherson wrote:
> > On Tue, Feb 14, 2023, Like Xu wrote:
> > > +	case MSR_AMD64_PERF_CNTR_GLOBAL_STATUS:
> > > +		if (!msr_info->host_initiated)
> > > +			return 0; /* Writes are ignored */
> >
> > Where is the "writes ignored" behavior documented? I can't find anything in the
> > APM that defines write behavior.
>
> KVM follows the real hardware behavior when the specifications stay silent
> on details or keep them secret.

So is that a "this isn't actually documented anywhere" answer? It's not your
responsibility to get AMD to document their CPUs, but I want to clearly document
when KVM's behavior is based solely off of observed hardware behavior, versus an
actual specification.

> How about this:
>
> /*
>  * Note, AMD ignores writes to reserved bits and read-only PMU MSRs,
>  * whereas Intel generates #GP on attempts to write reserved/RO MSRs.
>  */

Looks good.
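
For illustration only, the vendor difference described in that comment can be modeled outside of KVM roughly as below. This is a standalone sketch, not KVM code: `enum vendor`, `enum wr_result`, `struct ro_msr` and `write_ro_msr()` are hypothetical names, and the assumption that host-initiated writes always land is taken from the `host_initiated` check in the quoted hunk.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for illustration; none of these exist in KVM. */
enum vendor { VENDOR_AMD, VENDOR_INTEL };
enum wr_result { WR_OK = 0, WR_GP = 1 };

struct ro_msr {
	uint64_t value;
};

/*
 * Model a write to a read-only PMU MSR.
 *
 * Assumption: host-initiated writes (e.g. userspace restoring state) are
 * always accepted.  Guest writes are silently dropped on AMD and fault
 * with #GP on Intel, per the comment proposed above.
 */
static enum wr_result write_ro_msr(struct ro_msr *msr, enum vendor vendor,
				   bool host_initiated, uint64_t data)
{
	if (host_initiated) {
		msr->value = data;
		return WR_OK;
	}

	if (vendor == VENDOR_AMD)
		return WR_OK;	/* write ignored, value unchanged */

	return WR_GP;		/* Intel: reject the write */
}
```

In this model a guest write on AMD "succeeds" without changing the register, which is exactly the behavior that is only known from observed hardware and therefore worth calling out in a comment.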

> > > +		pmu->nr_arch_gp_counters = min_t(unsigned int,
> > > +						 ebx.split.num_core_pmc,
> > > +						 kvm_pmu_cap.num_counters_gp);
> > > +	} else if (guest_cpuid_has(vcpu, X86_FEATURE_PERFCTR_CORE)) {
> > >  		pmu->nr_arch_gp_counters = AMD64_NUM_COUNTERS_CORE;
> >
> > This needs to be sanitized, no? E.g. if KVM only has access to 4 counters, but
> > userspace sets X86_FEATURE_PERFCTR_CORE anyways. Hrm, unless I'm missing something,
> > that's a pre-existing bug.
>
> Now your point is that if user space sets more capabilities than KVM can
> support, KVM should constrain it.
> Your previous preference was that user space can set capabilities even if
> KVM doesn't support them, as long as it doesn't break KVM or the host, and
> the guest will eat its own.

Letting userspace define a "bad" configuration is perfectly ok, but KVM needs to
be careful not to endanger itself by consuming the bad state. A good example is
the handling of nested SVM features in svm_vcpu_after_set_cpuid(). KVM lets
userspace define anything and everything, but KVM only actually tries to utilize
a feature if the feature is actually supported in hardware.
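
As a rough standalone model of that pattern (not the actual svm_vcpu_after_set_cpuid() code; `struct vcpu_model`, `after_set_cpuid()` and the `FEAT_*` bits are made up for illustration), the idea is to store whatever userspace provides untouched, and separately compute what KVM will actually use:

```c
#include <stdint.h>

/* Hypothetical feature bits, for illustration only. */
#define FEAT_A	(1u << 0)
#define FEAT_B	(1u << 1)

struct vcpu_model {
	uint32_t guest_cpuid;	/* whatever userspace set, never rejected */
	uint32_t enabled;	/* what the hypervisor will actually use */
};

/*
 * Leave the userspace-provided CPUID alone, but only enable a feature
 * if it is both advertised to the guest and supported by hardware.
 */
static void after_set_cpuid(struct vcpu_model *vcpu, uint32_t hw_supported)
{
	vcpu->enabled = vcpu->guest_cpuid & hw_supported;
}
```

With `guest_cpuid = FEAT_A | FEAT_B` and hardware that only supports `FEAT_A`, the guest still sees both bits, but only `FEAT_A` ends up in `enabled`, so the bogus bit is never consumed.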

In this case, it's not clear to me that putting a bogus value into "nr_arch_gp_counters"
is safe (for KVM). And AIUI, the guest can't actually use more than
kvm_pmu_cap.num_counters_gp counters, i.e. KVM isn't arbitrarily restricting the
setup.
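
A minimal sketch of the kind of clamping I have in mind, written as a standalone helper rather than a patch against amd_pmu_refresh(). `nr_gp_counters()` and the `MODEL_*` constants are hypothetical stand-ins (the values mirror what I believe AMD64_NUM_COUNTERS and AMD64_NUM_COUNTERS_CORE to be), and `host_num_counters_gp` plays the role of kvm_pmu_cap.num_counters_gp:

```c
#include <stdbool.h>

/* Stand-in values for illustration; see the lead-in for the assumption. */
#define MODEL_NUM_COUNTERS	4
#define MODEL_NUM_COUNTERS_CORE	6

/*
 * Pick the number of GP counters for the vCPU.  The guest's CPUID decides
 * what is requested, but the result never exceeds what the host can
 * actually provide.
 */
static unsigned int nr_gp_counters(bool guest_has_perfctr_core,
				   unsigned int host_num_counters_gp)
{
	unsigned int want = guest_has_perfctr_core ? MODEL_NUM_COUNTERS_CORE
						   : MODEL_NUM_COUNTERS;

	return want < host_num_counters_gp ? want : host_num_counters_gp;
}
```

E.g. if the host only exposes 4 counters and userspace sets PERFCTR_CORE anyways, this yields 4 instead of 6, and the guest's CPUID is left as userspace defined it.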