Re: [PATCH v2 19/49] KVM: x86: Add a macro to init CPUID features that ignore host kernel support

From: Sean Christopherson
Date: Mon Jul 08 2024 - 16:53:15 EST


On Thu, Jul 04, 2024, Maxim Levitsky wrote:
> On Fri, 2024-05-17 at 10:38 -0700, Sean Christopherson wrote:
> > Add a macro for use in kvm_set_cpu_caps() to automagically initialize
> > features that KVM wants to support based solely on the CPU's capabilities,
> > e.g. KVM advertises LA57 support if it's available in hardware, even if
> > the host kernel isn't utilizing 57-bit virtual addresses.
> >
> > Take advantage of the fact that kvm_cpu_cap_mask() adjusts kvm_cpu_caps
> > based on raw CPUID, i.e. will clear features bits that aren't supported in
> > hardware, and simply force-set the capability before applying the mask.
> >
> > Abusing kvm_cpu_cap_set() is a borderline evil shenanigan, but doing so
> > avoid extra CPUID lookups, and a future commit will harden the entire
> > family of *F() macros to assert (at compile time) that every feature being
> > allowed is part of the capability word being processed, i.e. using a macro
> > will bring more advantages in the future.
>
> Could you explain what do you mean by "extra CPUID lookups"?

cpuid_ecx(7) incurs a CPUID to read the raw info, on top of the CPUID that is
executed by kvm_cpu_cap_init() (kvm_cpu_cap_mask() as of this patch). Obviously
not a big deal, but it's an extra VM-Exit when running as a VM.

> > +/*
> > + * Raw Feature - For features that KVM supports based purely on raw host CPUID,
> > + * i.e. that KVM virtualizes even if the host kernel doesn't use the feature.
> > + * Simply force set the feature in KVM's capabilities, raw CPUID support will
> > + * be factored in by kvm_cpu_cap_mask().
> > + */
> > +#define RAW_F(name) \
> > +({ \
> > + kvm_cpu_cap_set(X86_FEATURE_##name); \
> > + F(name); \
> > +})
> > +
> > /*
> > * Magic value used by KVM when querying userspace-provided CPUID entries and
> > * doesn't care about the CPIUD index because the index of the function in
> > @@ -682,15 +694,12 @@ void kvm_set_cpu_caps(void)
> > F(AVX512VL));
> >
> > kvm_cpu_cap_mask(CPUID_7_ECX,
> > - F(AVX512VBMI) | F(LA57) | F(PKU) | 0 /*OSPKE*/ | F(RDPID) |
> > + F(AVX512VBMI) | RAW_F(LA57) | F(PKU) | 0 /*OSPKE*/ | F(RDPID) |
> > F(AVX512_VPOPCNTDQ) | F(UMIP) | F(AVX512_VBMI2) | F(GFNI) |
> > F(VAES) | F(VPCLMULQDQ) | F(AVX512_VNNI) | F(AVX512_BITALG) |
> > F(CLDEMOTE) | F(MOVDIRI) | F(MOVDIR64B) | 0 /*WAITPKG*/ |
> > F(SGX_LC) | F(BUS_LOCK_DETECT)
> > );
> > - /* Set LA57 based on hardware capability. */
> > - if (cpuid_ecx(7) & F(LA57))
> > - kvm_cpu_cap_set(X86_FEATURE_LA57);
> >
> > /*
> > * PKU not yet implemented for shadow paging and requires OSPKE
>
> Putting a function call into a macro which evaluates into a bitmask is
> somewhat misleading, but let it be...
>
> IMHO in long term, it might be better to rip the whole huge 'or'ed mess, and replace
> it with a list of statements, along with comments for all unusual cases.

As in something like this?

kvm_cpu_cap_init(AVX512VBMI);
kvm_cpu_cap_init_raw(LA57);
kvm_cpu_cap_init(PKU);
...
kvm_cpu_cap_init(BUS_LOCK_DETECT);

kvm_cpu_cap_init_aliased(CPUID_8000_0001_EDX, FPU);

...

kvm_cpu_cap_init_scattered(CPUID_12_EAX, SGX1);
kvm_cpu_cap_init_scattered(CPUID_12_EAX, SGX2);
kvm_cpu_cap_init_scattered(CPUID_12_EAX, SGX_EDECCSSA);

The tricky parts are incorporating raw CPUID into the masking and handling features
that KVM _doesn't_ support. For raw CPUID, we could simply do CPUID every time, or
pre-fill an array to avoid hundreds of CPUIDs that are largely redudant.

But I don't see a way to mask off unsupported features without losing the
compile-time protections that the current code provides. And even if we took a
big hammer approach, e.g. finalized masking for all words at the very end, we'd
still need to carry state across each statement, i.e. we'd still need the bitwise-OR
and mask behavior, it would just be buried in helpers/macros.

I suspect the generated code will be larger, but I doubt that will actually be
problematic. The written code will also be more verbose (roughly 4x since we
tend to squeeze 4 features per line), and it will be harder to ensure initialization
of features in a given word are all co-located.

I definitely don't hate the idea, but I don't think it will be a clear "win" either.
Unless someone feels strongly about pursuing this approach, I'll add to the "things
to explore later" list.