Re: [PATCH v2 40/49] KVM: x86: Initialize guest cpu_caps based on KVM support

From: Sean Christopherson
Date: Mon Jul 08 2024 - 20:10:21 EST


On Thu, Jul 04, 2024, Maxim Levitsky wrote:
> On Fri, 2024-05-17 at 10:39 -0700, Sean Christopherson wrote:
> > @@ -421,6 +423,7 @@ void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
> > */
> > for (i = 0; i < NR_KVM_CPU_CAPS; i++) {
> > const struct cpuid_reg cpuid = reverse_cpuid[i];
> > + struct kvm_cpuid_entry2 emulated;
> >
> > if (!cpuid.function)
> > continue;
> > @@ -429,7 +432,16 @@ void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
> > if (!entry)
> > continue;
> >
> > - vcpu->arch.cpu_caps[i] = cpuid_get_reg_unsafe(entry, cpuid.reg);
> > + cpuid_func_emulated(&emulated, cpuid.function);
> > +
> > + /*
> > + * A vCPU has a feature if it's supported by KVM and is enabled
> > + * in guest CPUID. Note, this includes features that are
> > + * supported by KVM but aren't advertised to userspace!
> > + */
> > + vcpu->arch.cpu_caps[i] = kvm_cpu_caps[i] | kvm_vmm_cpu_caps[i] |
> > + cpuid_get_reg_unsafe(&emulated, cpuid.reg);
> > + vcpu->arch.cpu_caps[i] &= cpuid_get_reg_unsafe(entry, cpuid.reg);
>
> Hi,
>
> I have an idea. What if we get rid of kvm_vmm_cpu_caps, and instead advertise the
> MWAIT in KVM_GET_EMULATED_CPUID?
>
> MWAIT is sort of emulated as NOP after all, plus features in KVM_GET_EMULATED_CPUID are
> sort of 'emulated inefficiently' and you can say that NOP is an inefficient emulation
> of MWAIT sort of.

Heh, sort of indeed. I really don't want to advertise MWAIT to userspace in any
capacity beyond KVM_CAP_X86_DISABLE_EXITS, because advertising MWAIT to VMs when
MONITOR/MWAIT exiting is enabled is actively harmful, to both host and guest.

KVM also doesn't emulate them on #UD, unlike MOVBE, which would make the API even
more confusing than it already is.

> It just feels to me that kvm_vmm_cpu_caps, is somewhat an overkill, and its name is
> somewhat confusing.

Yeah, I don't love it either, but trying to handle MWAIT as a one-off was even
uglier. One option would be to piggyback cpuid_func_emulated(), but add a param
to have it fill MWAIT only for KVM's internal purposes. That'd essentially be
the same as a one-off in kvm_vcpu_after_set_cpuid(), but less ugly.

I'd say it comes down to whether or not we expect to have more features that KVM
"supports", but doesn't advertise to userspace. If we do, then I think adding
VMM_F() is the way to go. If we expect MWAIT to be the only feature that gets
this treatment, then I'm ok if we bastardize cpuid_func_emulated().

And I think/hope that MWAIT will be a one-off. Emulating it as a nop was a
mistake and has since been quirked, and I like to think we (eventually) learn
from our mistakes.

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 0e64a6332052..dbc3f6ce9203 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -448,7 +448,7 @@ void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
if (!entry)
continue;

- cpuid_func_emulated(&emulated, cpuid.function);
+ cpuid_func_emulated(&emulated, cpuid.function, false);

/*
* A vCPU has a feature if it's supported by KVM and is enabled
@@ -1034,7 +1034,8 @@ static struct kvm_cpuid_entry2 *do_host_cpuid(struct kvm_cpuid_array *array,
return entry;
}

-static int cpuid_func_emulated(struct kvm_cpuid_entry2 *entry, u32 func)
+static int cpuid_func_emulated(struct kvm_cpuid_entry2 *entry, u32 func,
+ bool only_advertised)
{
memset(entry, 0, sizeof(*entry));

@@ -1048,6 +1049,9 @@ static int cpuid_func_emulated(struct kvm_cpuid_entry2 *entry, u32 func)
return 1;
case 1:
entry->ecx = F(MOVBE);
+ /* comment goes here. */
+ if (!only_advertised)
+ entry->ecx |= F(MWAIT);
return 1;
case 7:
entry->flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
@@ -1065,7 +1069,7 @@ static int __do_cpuid_func_emulated(struct kvm_cpuid_array *array, u32 func)
if (array->nent >= array->maxnent)
return -E2BIG;

- array->nent += cpuid_func_emulated(&array->entries[array->nent], func);
+ array->nent += cpuid_func_emulated(&array->entries[array->nent], func, true);
return 0;
}