Re: [PATCH v2 3/6] KVM: x86: selftests: Set up AMD VM in pmu_counters_test

From: Sean Christopherson
Date: Wed Jan 08 2025 - 13:36:46 EST


On Wed, Sep 18, 2024, Colton Lewis wrote:
> Branch in main() depending on if the CPU is Intel or AMD. They are
> subject to vastly different requirements because the AMD PMU lacks
> many properties defined by the Intel PMU including the entire CPUID
> 0xa function where Intel stores all the PMU properties. AMD lacks this
> as well as any consistent notion of PMU versions as Intel does. Every
> feature is a separate flag and they aren't the same features as Intel.
>
> Set up a VM for testing core AMD counters and ensure proper CPUID
> features are set.
>
> Signed-off-by: Colton Lewis <coltonlewis@xxxxxxxxxx>
> ---
> .../selftests/kvm/x86_64/pmu_counters_test.c | 104 ++++++++++++++----
> 1 file changed, 83 insertions(+), 21 deletions(-)
>
> diff --git a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
> index 0e305e43a93b..5b240585edc5 100644
> --- a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
> +++ b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
> @@ -30,10 +30,21 @@
> #define NUM_INSNS_RETIRED (NUM_LOOPS * NUM_INSNS_PER_LOOP + NUM_EXTRA_INSNS)
>
>
> +/*
> + * Limit testing to MSRs that are actually defined by Intel (in the SDM). MSRs
> + * that aren't defined counter MSRs *probably* don't exist, but there's no
> + * guarantee that currently undefined MSR indices won't be used for something
> + * other than PMCs in the future.
> + */
> +#define MAX_NR_GP_COUNTERS 8
> +#define MAX_NR_FIXED_COUNTERS 3
> +#define AMD_NR_CORE_COUNTERS 4
> +#define AMD_NR_CORE_EXT_COUNTERS 6
> +
> static uint8_t kvm_pmu_version;
> static bool kvm_has_perf_caps;
>
> -static struct kvm_vm *pmu_vm_create_with_one_vcpu(struct kvm_vcpu **vcpu,
> +static struct kvm_vm *intel_pmu_vm_create(struct kvm_vcpu **vcpu,
> void *guest_code,

When renaming things, please fixup the alignment as needed. Yes, it's more
churn, but one-time pain is preferable to living indefinitely with funky formatting.

I also don't like renaming just one symbol. E.g. the above MAX_NR_GP_COUNTERS
and MAX_NR_FIXED_COUNTERS #defines are Intel specific, but that's not at all
clear from the code. Ditto for guest_rd_wr_counters() vs.
guest_test_rdwr_core_counters().

Given how little code is actually shared between Intel and AMD, I think it makes
sense to have the bulk of the code live in separate .c files. Since
tools/testing/selftests/kvm/lib/x86/pmu.c is already a thing, the best option is
probably to rename pmu_counters_test.c to intel_pmu_counters_test.c, and then
extract the common bits to lib/x86/pmu.c (or include/x86/pmu.h as appropriate).

> uint8_t pmu_version,
> uint64_t perf_capabilities)
> +static void test_core_counters(void)
> +{
> + uint8_t nr_counters = nr_core_counters();
> + bool core_ext = kvm_cpu_has(X86_FEATURE_PERF_CTR_EXT_CORE);
> + bool perfmon_v2 = kvm_cpu_has(X86_FEATURE_PERFMON_V2);
> + struct kvm_vcpu *vcpu;
> + struct kvm_vm *vm;
> +
> + for (uint8_t ce = 0; ce <= core_ext; ce++) {

Kernel style is to not declared variables inside for-loops.

> + for (uint8_t pm = 0; pm <= perfmon_v2; pm++) {

Iterating over booleans is decidedly odd, the indentation levels are painful and
will only get worse as more features are added, and the "ce" and "pm" variables
aren't all that intuitive. More below.

> + for (uint8_t nc = 0; nc <= nr_counters; nc++) {

I also find "nc" to be unintuitive. Either use a fully descriptive name, or
make it obvious that the variables is an iterator. E.g. either

uint8_t max_nr_counters = nr_core_counters();

...

for (nr_counters = 0; nr_counters < max_nr_counters; nr_counters++) {


or

for (j = 0; j < nr_counters; j++) {


'j' is obviously not descriptive, but when reading the usage, it's more obvious
that it's a loop iterator (if you choose

> + vm = vm_create_with_one_vcpu(&vcpu, guest_test_core_counters);
> +
> + if (nc)

Is '0' not a legal number of counters?

> + vcpu_set_cpuid_property(

Google3! (Never, ever wrap immediately after the opening paranethesis).

> + vcpu, X86_PROPERTY_NUM_PERF_CTR_CORE, nc);
> + if (ce)
> + vcpu_set_cpuid_feature(
> + vcpu, X86_FEATURE_PERF_CTR_EXT_CORE);

This likely doesn't do what you want. By default, vm_arch_vcpu_add() initializes
CPUID to KVM's supported CPUID. So only _setting_ the feature means that that
the test is likely only ever running with the full set of supported features.

Jumping back to my complaints with the for-loops, if the features to interate on
are collected in an array, then the test can generate a mask of all possible
combinations and iterate over that (plus the array itself). That keeps the
indentation bounded and eliminates the copy+paste needed to add a new feature.
The only downside is that the test is limited to 64 features, but we'll run into
run time issues long before that limit is reached.

const struct kvm_x86_cpu_feature pmu_features[] = {
X86_FEATURE_PERF_CTR_EXT_CORE,
X86_FEATURE_PERFMON_V2,
};

const u64 pmu_features_mask = BIT_ULL(ARRAY_SIZE(pmu_features)) - 1;

for (mask = 0; mask <= pmu_features_mask; mask++) {
for (nr_counters = 0; nr_counters < max_nr_counters; nr_counters++) {
vm = vm_create_with_one_vcpu(&vcpu, guest_test_core_counters);

vcpu_set_cpuid_property(vcpu, X86_PROPERTY_NUM_PERF_CTR_CORE,
nr_counters);

/* Comment goes here */
for (i = 0; i < ARRAY_SIZE(pmu_features); i++)
vcpu_set_or_clear_cpuid_feature(vcpu, pmu_features[i],
mask & BIT_ULL(i));

...
}

> + if (pm)
> + vcpu_set_cpuid_feature(
> + vcpu, X86_FEATURE_PERFMON_V2);
> +
> + pr_info("Testing core counters: CoreExt = %u, PerfMonV2 = %u, NumCounters = %u\n",
> + ce, pm, nc);
> + run_vcpu(vcpu);
> +
> + kvm_vm_free(vm);
> + }
> + }
> + }
> +}