Re: [PATCH] KVM: x86: Provide a capability to disable APERF/MPERF read intercepts

From: Paolo Bonzini
Date: Fri Mar 14 2025 - 11:07:20 EST


On 3/14/25 14:59, Sean Christopherson wrote:
On Thu, Mar 13, 2025, Jim Mattson wrote:
On Mon, Feb 24, 2025 at 4:47 PM Jim Mattson <jmattson@xxxxxxxxxx> wrote:

Allow a guest to read the physical IA32_APERF and IA32_MPERF MSRs
without interception.

The IA32_APERF and IA32_MPERF MSRs are not virtualized. Writes are not
handled at all. The MSR values are not zeroed on vCPU creation, saved
on suspend, or restored on resume. No accommodation is made for
processor migration or for sharing a logical processor with other
tasks. No adjustments are made for non-unit TSC multipliers. The MSRs
do not account for time the same way as the comparable PMU events,
whether the PMU is virtualized by the traditional emulation method or
the new mediated pass-through approach.

Nonetheless, in a properly constrained environment, this capability
can be combined with a guest CPUID table that advertises support for
CPUID.6:ECX.APERFMPERF[bit 0] to induce a Linux guest to report the
effective physical CPU frequency in /proc/cpuinfo. Moreover, there is
no performance cost for this capability.
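
For readers unfamiliar with how a guest turns these two counters into a frequency, here is a minimal sketch in C. It assumes the usual relationship: over a sampling window, the ratio of the APERF delta to the MPERF delta scales the base frequency. The struct, function, and parameter names are hypothetical and are not taken from the patch or from the Linux sources.

```c
#include <stdint.h>

/* Hypothetical container for one reading of the two MSRs. */
struct aperfmperf_sample {
	uint64_t aperf;	/* IA32_APERF value */
	uint64_t mperf;	/* IA32_MPERF value */
};

/*
 * Estimate the effective frequency in kHz over the window [prev, cur].
 * base_khz is the frequency at which MPERF counts (assumed to be known
 * to the caller). Returns 0 if MPERF did not advance.
 *
 * Note: base_khz * delta can overflow 64 bits for very long windows;
 * this sketch assumes short sampling intervals.
 */
static uint64_t effective_khz(struct aperfmperf_sample prev,
			      struct aperfmperf_sample cur,
			      uint64_t base_khz)
{
	uint64_t da = cur.aperf - prev.aperf;
	uint64_t dm = cur.mperf - prev.mperf;

	if (dm == 0)
		return 0;

	return base_khz * da / dm;
}
```

With the intercepts disabled, the deltas seen by the guest include whatever else ran on the logical processor during the window, which is one reason the caveats above matter.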

Signed-off-by: Jim Mattson <jmattson@xxxxxxxxxx>
---

...

Any thoughts?

It's absolutely absurd, but I like it. I would much rather provide functionality
that is flawed in obvious ways, as opposed to functionality that is flawed in
subtle and hard-to-grok ways. Especially when the former is orders of magnitude
less complex.

I have no objections, so long as we add very explicit disclaimers in the docs.

FWIW, the only reason my response was delayed is because I was trying to figure
out if there's a clean way to avoid adding a large number of capabilities for
things like this.

True, but it's not even a new capability; it's just a new bit in the existing KVM_CAP_X86_DISABLE_EXITS.

Just one question:

- u64 r = KVM_X86_DISABLE_EXITS_PAUSE;
+ u64 r = KVM_X86_DISABLE_EXITS_PAUSE | KVM_X86_DISABLE_EXITS_APERFMPERF;

Should it be conditional on the host having the APERFMPERF feature itself? As is, the patch _does_ do something sensible, i.e. #GP, but this puts the burden on userspace of checking the host CPUID and figuring out whether it makes sense to expose the feature to the guest. It would be simpler for userspace to be able to say "if the bit is there, then enable it and make it visible through CPUID".
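
To make the question concrete, here is a rough, self-contained sketch of both sides. It is not the patch: the SKETCH_* constants and all function names are hypothetical, the APERFMPERF bit value is an assumption, and real kernel code would presumably test the host feature flag (e.g. X86_FEATURE_APERFMPERF) rather than a raw CPUID register passed in as a parameter.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the uapi bits; the APERFMPERF value is assumed. */
#define SKETCH_DISABLE_EXITS_PAUSE	(1ULL << 2)
#define SKETCH_DISABLE_EXITS_APERFMPERF	(1ULL << 4)

/* CPUID.6:ECX.APERFMPERF is bit 0. */
#define SKETCH_CPUID6_ECX_APERFMPERF	(1u << 0)

/* Kernel side: only advertise the bit if the host has the MSRs. */
static uint64_t advertised_disable_exits(uint32_t host_cpuid6_ecx)
{
	uint64_t r = SKETCH_DISABLE_EXITS_PAUSE;

	if (host_cpuid6_ecx & SKETCH_CPUID6_ECX_APERFMPERF)
		r |= SKETCH_DISABLE_EXITS_APERFMPERF;

	return r;
}

/*
 * Userspace side: "if the bit is there, then enable it and make it
 * visible through CPUID". No separate host CPUID check is needed.
 * Returns true if the feature was enabled and exposed.
 */
static bool enable_and_expose(uint64_t advertised, uint64_t *enable_mask,
			      uint32_t *guest_cpuid6_ecx)
{
	if (!(advertised & SKETCH_DISABLE_EXITS_APERFMPERF))
		return false;

	*enable_mask |= SKETCH_DISABLE_EXITS_APERFMPERF;
	*guest_cpuid6_ecx |= SKETCH_CPUID6_ECX_APERFMPERF;
	return true;
}
```

In this arrangement the value returned when checking the capability is the only thing userspace has to look at before passing the mask to KVM_ENABLE_CAP and filling in the guest CPUID table.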

Paolo