Re: [PATCH v1] KVM/x86/vPMU: Guest PMI Optimization

From: Andi Kleen
Date: Fri Oct 12 2018 - 12:31:01 EST


> 4. Results
> - Without this optimization, the guest pmi handling time is
> ~4500000 ns, and the max sampling rate is reduced to 250.
> - With this optimization, the guest pmi handling time is ~9000 ns
> (i.e. 1 / 500 of the non-optimization case), and the max sampling
> rate remains at the original 100000.

Impressive performance improvement!

It's not clear to me why you're special casing PMIs here. The optimization
should work generically, right?

perf will enable/disable the PMU even outside PMIs, e.g. on context
switches, which is a very important path too.

> @@ -237,9 +267,23 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  	default:
>  		if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
>  		    (pmc = get_fixed_pmc(pmu, msr))) {
> -			if (!msr_info->host_initiated)
> -				data = (s64)(s32)data;
> -			pmc->counter += data - pmc_read_counter(pmc);
> +			if (pmu->in_pmi) {
> +				/*
> +				 * Since we are not re-allocating a perf event
> +				 * to reconfigure the sampling time when the
> +				 * guest pmu is in PMI, just set the value to
> +				 * the hardware perf counter. Counting will
> +				 * continue after the guest enables the
> +				 * counter bit in MSR_CORE_PERF_GLOBAL_CTRL.
> +				 */
> +				struct hw_perf_event *hwc =
> +						&pmc->perf_event->hw;
> +				wrmsrl(hwc->event_base, data);

Is this guaranteed to always be called on the CPU that will run the vcpu?

AFAIK there's an ioctl (KVM_SET_MSRS) that lets qemu set guest MSRs, and
I'm pretty sure this code won't handle that case.

The hardware write may need to be deferred until VM entry time.
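
To illustrate the deferral idea, here is a rough userspace toy model, not
KVM code. Every name in it (toy_pmc, toy_set_counter, toy_vcpu_enter,
hw_counter) is made up for the sketch, and the array write only stands in
for wrmsrl(). The MSR handler just records the value; the actual write
happens on whichever CPU performs the entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 4	/* toy value, not the kernel's NR_CPUS */

/* Stand-in for one hardware counter MSR per physical CPU. */
static uint64_t hw_counter[NR_CPUS];

/* Hypothetical per-counter state; struct kvm_pmc has no such fields. */
struct toy_pmc {
	uint64_t pending_value;	/* last value written by guest or userspace */
	bool pending;		/* true until flushed to hardware */
};

/*
 * MSR write handler. It may run on any CPU (e.g. from a userspace
 * ioctl), so it only records the value and never touches hardware.
 */
static void toy_set_counter(struct toy_pmc *pmc, uint64_t data)
{
	pmc->pending_value = data;
	pmc->pending = true;
}

/*
 * Called right before entering the guest on 'cpu', i.e. on the CPU
 * that will actually run the vcpu. Only here is the write performed.
 */
static void toy_vcpu_enter(struct toy_pmc *pmc, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	if (pmc->pending) {
		hw_counter[cpu] = pmc->pending_value;	/* stands in for wrmsrl() */
		pmc->pending = false;
	}
}
```

With this shape, a write that arrives while the vcpu is not loaded (or is
loaded elsewhere) is harmless: nothing reaches a counter until entry, and
a second entry without a new write does nothing.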

-Andi