Re: [PATCH v5 08/13] KVM: selftests: Test Intel PMU architectural events on gp counters

From: Sean Christopherson
Date: Thu Oct 26 2023 - 16:54:10 EST


On Thu, Oct 26, 2023, Mingwei Zhang wrote:
> > +static bool pmu_is_intel_event_stable(uint8_t idx)
> > +{
> > + switch (idx) {
> > + case INTEL_ARCH_CPU_CYCLES:
> > + case INTEL_ARCH_INSTRUCTIONS_RETIRED:
> > + case INTEL_ARCH_REFERENCE_CYCLES:
> > + case INTEL_ARCH_BRANCHES_RETIRED:
> > + return true;
> > + default:
> > + return false;
> > + }
> > +}
>
> Brief explanation on why other events are not stable please. Since there
> are only a few architecture events, maybe listing all of them with
> explanation in comments would work better.

Heh, I've already rewritten this logic to make


> > +
> > +static void guest_measure_pmu_v1(struct kvm_x86_pmu_feature event,
> > + uint32_t counter_msr, uint32_t nr_gp_counters)
> > +{
> > + uint8_t idx = event.f.bit;
> > + unsigned int i;
> > +
> > + for (i = 0; i < nr_gp_counters; i++) {
> > + wrmsr(counter_msr + i, 0);
> > + wrmsr(MSR_P6_EVNTSEL0 + i, ARCH_PERFMON_EVENTSEL_OS |
> > + ARCH_PERFMON_EVENTSEL_ENABLE | intel_pmu_arch_events[idx]);
> > + __asm__ __volatile__("loop ." : "+c"((int){NUM_BRANCHES}));
>
> Some comment might be needed for readability. Abuptly inserting inline
> assembly code in C destroys the readability.
>
> I wonder do we need add 'clobber' here for the above line, since it
> takes away ecx?

It's already there. You can't directly clobber a register that is used as an
input constraint. The workaround is to make the register both an input and an
output, hense the "+c" in the outputs section instead of just "c" in the inputs
section. The extra bit of cleverness is to use an intermediate anonymous variable
so that NUM_BRANCHES can effectively be passed in (#defines won't work as output
constraints).

> Also, I wonder if we need to disable IRQ here? This code might be
> intercepted and resumed. If so, then the test will get a different
> number?

This is guest code, disabling IRQs is pointless. There are no guest virtual IRQs,
guarding aginst host IRQs is impossible, unnecessary, and actualy undesirable,
i.e. the guest vPMU shouldn't be counting host instructions and whatnot.

> > +
> > + if (pmu_is_intel_event_stable(idx))
> > + GUEST_ASSERT_EQ(this_pmu_has(event), !!_rdpmc(i));
>
> Okay, just the counter value is non-zero means we pass the test ?!

FWIW, I've updated

> hmm, I wonder other than IRQ stuff, what else may affect the result? NMI
> watchdog or what?

This is the beauty of selftests. There _so_ simple that there are very few
surprises. E.g. there are no events of any kind unless the test explicitly
generates them. The downside is that doing anything complex in selftests requires
writing a fair bit of code.

> > +
> > + wrmsr(MSR_P6_EVNTSEL0 + i, ARCH_PERFMON_EVENTSEL_OS |
> > + !ARCH_PERFMON_EVENTSEL_ENABLE |
> > + intel_pmu_arch_events[idx]);
> > + wrmsr(counter_msr + i, 0);
> > + __asm__ __volatile__("loop ." : "+c"((int){NUM_BRANCHES}));
> ditto for readability. Please consider using a macro to avoid repeated
> explanation.

Heh, already did this too. Though I'm not entirely sure it's more readable. It's
definitely more precise and featured :-)

#define GUEST_MEASURE_EVENT(_msr, _value, clflush, FEP) \
do { \
__asm__ __volatile__("wrmsr\n\t" \
clflush "\n\t" \
"mfence\n\t" \
"1: mov $" __stringify(NUM_BRANCHES) ", %%ecx\n\t" \
FEP "loop .\n\t" \
FEP "mov %%edi, %%ecx\n\t" \
FEP "xor %%eax, %%eax\n\t" \
FEP "xor %%edx, %%edx\n\t" \
"wrmsr\n\t" \
: "+c"((int){_msr}) \
: "a"((uint32_t)_value), "d"(_value >> 32), \
"D"(_msr) \
); \
} while (0)


> > +int main(int argc, char *argv[])
> > +{
> > + TEST_REQUIRE(get_kvm_param_bool("enable_pmu"));
> > +
> > + TEST_REQUIRE(host_cpu_is_intel);
> > + TEST_REQUIRE(kvm_cpu_has_p(X86_PROPERTY_PMU_VERSION));
> > + TEST_REQUIRE(kvm_cpu_property(X86_PROPERTY_PMU_VERSION) > 0);
> > + TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_PDCM));
>
> hmm, this means we cannot run this in nested if X86_FEATURE_PDCM is
> missing. It only affects full-width counter, right?

Ah, yeah, good call. It won't be too much trouble to have the test play nice
with !PDCM.