Re: [PATCH v2 26/49] KVM: x86: Add a macro to init CPUID features that KVM emulates in software

From: Sean Christopherson
Date: Mon Jul 08 2024 - 18:31:15 EST


On Thu, Jul 04, 2024, Maxim Levitsky wrote:
> On Fri, 2024-05-17 at 10:39 -0700, Sean Christopherson wrote:
> > Now that kvm_cpu_cap_init() is a macro with its own scope, add EMUL_F() to
> > OR-in features that KVM emulates in software, i.e. that don't depend on
> > the feature being available in hardware. The contained scope
> > of kvm_cpu_cap_init() allows using a local variable to track the set of
> > emulated leaves, which in addition to avoiding confusing and/or
> > unnecessary variables, helps prevent misuse of EMUL_F().
> >
> > Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> > ---
> > arch/x86/kvm/cpuid.c | 36 +++++++++++++++++++++---------------
> > 1 file changed, 21 insertions(+), 15 deletions(-)
> >
> > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> > index 1064e4d68718..33e3e77de1b7 100644
> > --- a/arch/x86/kvm/cpuid.c
> > +++ b/arch/x86/kvm/cpuid.c
> > @@ -94,6 +94,16 @@ u32 xstate_required_size(u64 xstate_bv, bool compacted)
> > F(name); \
> > })
> >
> > +/*
> > + * Emulated Feature - For features that KVM emulates in software irrespective
> > + * of host CPU/kernel support.
> > + */
> > +#define EMUL_F(name) \
> > +({ \
> > + kvm_cpu_cap_emulated |= F(name); \
> > + F(name); \
> > +})
>
> To me it feels more and more that this patch series doesn't go into the right
> direction.
>
> How about we just abandon the whole concept of masks and instead just have a
> list of statements
>
> Pretty much the opposite of the patch series I confess:

FWIW, I think it's actually largely the same code under the hood. The code for
each concept/flavor ends up being very similar, it's mostly just handling the
bitwise-OR in the callers vs. in the helpers.

> #define CAP_PASSTHOUGH 0x01
> #define CAP_EMULATED 0x02
> #define CAP_AMD_ALIASED 0x04 // for AMD aliased features
> #define CAP_SCATTERED 0x08
> #define CAP_X86_64 0x10 // supported only on 64 bit hypervisors
> ...
>
>
> /* CPUID_1_ECX*/
>
> /* TMA is not passed though because: xyz*/
> kvm_cpu_cap_init(TMA, 0);
>
> kvm_cpu_cap_init(SSSE3, CAP_PASSTHOUGH);
> /* CNXT_ID is not passed though because: xyz*/
> kvm_cpu_cap_init(CNXT_ID, 0);
> kvm_cpu_cap_init(RESERVED, 0);
> kvm_cpu_cap_init(FMA, CAP_PASSTHOUGH);
> ...
> /* KVM always emulates TSC_ADJUST */
> kvm_cpu_cap_init(TSC_ADJUST, CAP_EMULATED | CAP_SCATTERED);
>
> ...
>
> /* CPUID_D_1_EAX*/
> /* XFD is disabled on 32 bit systems because: xyz*/
> kvm_cpu_cap_init(XFD, CAP_PASSTHOUGH | CAP_X86_64)
>
>
> 'kvm_cpu_cap_init' can be a macro if needed to have the compile checks.
>
> There are several advantages to this:
>
> - more readability, plus if needed each statement can be amended with a comment.
> - No weird hacks in 'F*' macros, which additionally eventually evaluate into a bit,
> which is confusing.
> In fact no need to even have them at all.
> - No need to verify that bitmask belongs to a feature word.

Yes, but the downside is that there is no enforcement of features in a word being
bundled together.

> - Merge friendly - each capability has its own line.

That's almost entirely convention though. Other than inertia, nothing is stopping
us from doing:

kvm_cpu_cap_init(CPUID_12_EAX,
SF(SGX1) |
SF(SGX2) |
SF(SGX_EDECCSSA)
);

I don't see a clean way of avoiding the addition of " |" on the last existing
line, but in practice I highly doubt that will ever be a source of meaningful pain.

Same goes for the point about adding comments. We could do that with either
approach, we just don't do so today.

> Disadvantages:
>
> - Longer list - IMHO not a problem, since it is very easy to read / search
> and can have as much comments as needed.
> For example this is how the kernel lists the CPUID features and this list IMHO
> is very manageable.

There's one big difference: KVM would need to have a line for every feature that
KVM _doesn't_ support. For densely populated words, that's not a huge issue,
but it's problematic for sparsely populated words, e.g. CPUID_12_EAX would have
29 reserved/unsupport entries, which IMO ends up being a big net negative for
code readability and ongoing maintenance.

We could avoid that cost (and the danger of a missed bit) by collecting the set
of features that have been initialized for each word, and then masking off the
uninitialized/unsupported at the end. But then we're back to the bitwise-OR and
mask logic.

And while I agree that having the F*() macros set state _and_ evaulate to a bit
is imperfect, it does have its advantages. E.g. to avoid evaluating to a value,
we could have F() modify a local variable that is scoped to kvm_cpu_cap_init(),
a las kvm_cpu_cap_emulated. But then we'd need explicit code and/or comments
to call out that VMM_F() and the like intentionally don't set kvm_cpu_cap_supported,
whereas evualating to a value is a relatively self-documenting "0;".

> - Slower - kvm_set_cpu_caps is called exactly once per KVM module load, thus
> performance is the last thing I would care about in this function.
>
> Another note about this patch: It is somewhat confusing that EMUL_F just
> forces a feature in kvm caps, regardless of CPU support, because KVM also has
> KVM_GET_EMULATED_CPUID and it has a different meaning.

Yeah, but IMO that's a problem with KVM_GET_EMULATED_CPUID being poorly defined.

> Users can easily confuse the EMUL_F for something that sets a feature bit in
> the KVM_GET_EMULATED_CPUID.

I'll see if I can find a good spot for a comment to try and convenient