Re: [PATCH v2 26/49] KVM: x86: Add a macro to init CPUID features that KVM emulates in software
From: Maxim Levitsky
Date: Wed Jul 24 2024 - 13:59:13 EST
On Mon, 2024-07-08 at 15:30 -0700, Sean Christopherson wrote:
> On Thu, Jul 04, 2024, Maxim Levitsky wrote:
> > On Fri, 2024-05-17 at 10:39 -0700, Sean Christopherson wrote:
> > > Now that kvm_cpu_cap_init() is a macro with its own scope, add EMUL_F() to
> > > OR-in features that KVM emulates in software, i.e. that don't depend on
> > > the feature being available in hardware. The contained scope
> > > of kvm_cpu_cap_init() allows using a local variable to track the set of
> > > emulated leaves, which in addition to avoiding confusing and/or
> > > unnecessary variables, helps prevent misuse of EMUL_F().
> > >
> > > Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> > > ---
> > > arch/x86/kvm/cpuid.c | 36 +++++++++++++++++++++---------------
> > > 1 file changed, 21 insertions(+), 15 deletions(-)
> > >
> > > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> > > index 1064e4d68718..33e3e77de1b7 100644
> > > --- a/arch/x86/kvm/cpuid.c
> > > +++ b/arch/x86/kvm/cpuid.c
> > > @@ -94,6 +94,16 @@ u32 xstate_required_size(u64 xstate_bv, bool compacted)
> > > F(name); \
> > > })
> > >
> > > +/*
> > > + * Emulated Feature - For features that KVM emulates in software irrespective
> > > + * of host CPU/kernel support.
> > > + */
> > > +#define EMUL_F(name) \
> > > +({ \
> > > + kvm_cpu_cap_emulated |= F(name); \
> > > + F(name); \
> > > +})
> >
> > To me it feels more and more that this patch series doesn't go into the right
> > direction.
> >
> > How about we just abandon the whole concept of masks and instead just have a
> > list of statements
> >
> > Pretty much the opposite of the patch series I confess:
>
> FWIW, I think it's actually largely the same code under the hood. The code for
> each concept/flavor ends up being very similar, it's mostly just handling the
> bitwise-OR in the callers vs. in the helpers.
>
> > #define CAP_PASSTHOUGH 0x01
> > #define CAP_EMULATED 0x02
> > #define CAP_AMD_ALIASED 0x04 // for AMD aliased features
> > #define CAP_SCATTERED 0x08
> > #define CAP_X86_64 0x10 // supported only on 64 bit hypervisors
> > ...
> >
> >
> > /* CPUID_1_ECX*/
> >
> > /* TMA is not passed though because: xyz*/
> > kvm_cpu_cap_init(TMA, 0);
> >
> > kvm_cpu_cap_init(SSSE3, CAP_PASSTHOUGH);
> > /* CNXT_ID is not passed though because: xyz*/
> > kvm_cpu_cap_init(CNXT_ID, 0);
> > kvm_cpu_cap_init(RESERVED, 0);
> > kvm_cpu_cap_init(FMA, CAP_PASSTHOUGH);
> > ...
> > /* KVM always emulates TSC_ADJUST */
> > kvm_cpu_cap_init(TSC_ADJUST, CAP_EMULATED | CAP_SCATTERED);
> >
> > ...
> >
> > /* CPUID_D_1_EAX*/
> > /* XFD is disabled on 32 bit systems because: xyz*/
> > kvm_cpu_cap_init(XFD, CAP_PASSTHOUGH | CAP_X86_64)
> >
> >
> > 'kvm_cpu_cap_init' can be a macro if needed to have the compile checks.
> >
> > There are several advantages to this:
> >
> > - more readability, plus if needed each statement can be amended with a comment.
> > - No weird hacks in 'F*' macros, which additionally eventually evaluate into a bit,
> > which is confusing.
> > In fact no need to even have them at all.
> > - No need to verify that bitmask belongs to a feature word.
>
> Yes, but the downside is that there is no enforcement of features in a word being
> bundled together.
As I explained earlier, this is not an issue in principle, even if the caps are not
grouped together, the code will still work just fine.
kvm_cpu_cap_init_begin(CPUID_1_ECX);
/* TMA is not passed though because: xyz*/
kvm_cpu_cap_init(TMA, 0);
kvm_cpu_cap_init(SSSE3, CAP_PASSTHOUGH);
/* CNXT_ID is not passed though because: xyz*/
kvm_cpu_cap_init(CNXT_ID, 0);
kvm_cpu_cap_init(RESERVED, 0);
kvm_cpu_cap_init(FMA, CAP_PASSTHOUGH);
...
/* KVM always emulates TSC_ADJUST */
kvm_cpu_cap_init(TSC_ADJUST, CAP_EMULATED | CAP_SCATTERED);
kvm_cpu_cap_init_end(CPUID_1_ECX);
...
...
And kvm_cpu_cap_init_begin, can set some cap_in_progress variable.
>
> > - Merge friendly - each capability has its own line.
>
> That's almost entirely convention though. Other than inertia, nothing is stopping
> us from doing:
>
> kvm_cpu_cap_init(CPUID_12_EAX,
> SF(SGX1) |
> SF(SGX2) |
> SF(SGX_EDECCSSA)
That trivial change is already an improvement, although it still leaves the problem
of thinking that this is one bit 'or', which was reasonable before this patch series,
because it was indeed one big 'or' but now there is lots of things going on behind
the scenes and that violates the principle of the least surprise.
My suggestion fixes this, because when the user sees a series of function calls,
and nobody will assume anything about these functions calls in contrast with series
of 'ors'. It's just how I look at it.
> );
>
> I don't see a clean way of avoiding the addition of " |" on the last existing
> line, but in practice I highly doubt that will ever be a source of meaningful pain.
>
> Same goes for the point about adding comments. We could do that with either
> approach, we just don't do so today.
Yes, from the syntax POV there is indeed no problem, and I do agree that putting
each feature on its own line, together with comments for the features that need it
is a win-win improvement over what we have after this patch series.
>
> > Disadvantages:
> >
> > - Longer list - IMHO not a problem, since it is very easy to read / search
> > and can have as much comments as needed.
> > For example this is how the kernel lists the CPUID features and this list IMHO
> > is very manageable.
>
> There's one big difference: KVM would need to have a line for every feature that
> KVM _doesn't_ support.
Could you elaborate on why?
If we zero the whole leaf and then set specific bits there, one bit per kvm_cpu_cap_init.
> For densely populated words, that's not a huge issue,
> but it's problematic for sparsely populated words, e.g. CPUID_12_EAX would have
> 29 reserved/unsupport entries, which IMO ends up being a big net negative for
> code readability and ongoing maintenance.
>
> We could avoid that cost (and the danger of a missed bit) by collecting the set
> of features that have been initialized for each word, and then masking off the
> uninitialized/unsupported at the end. But then we're back to the bitwise-OR and
> mask logic.
>
> And while I agree that having the F*() macros set state _and_ evaulate to a bit
> is imperfect, it does have its advantages. E.g. to avoid evaluating to a value,
> we could have F() modify a local variable that is scoped to kvm_cpu_cap_init(),
> a las kvm_cpu_cap_emulated. But then we'd need explicit code and/or comments
> to call out that VMM_F() and the like intentionally don't set kvm_cpu_cap_supported,
> whereas evualating to a value is a relatively self-documenting "0;".
>
> > - Slower - kvm_set_cpu_caps is called exactly once per KVM module load, thus
> > performance is the last thing I would care about in this function.
> >
> > Another note about this patch: It is somewhat confusing that EMUL_F just
> > forces a feature in kvm caps, regardless of CPU support, because KVM also has
> > KVM_GET_EMULATED_CPUID and it has a different meaning.
>
> Yeah, but IMO that's a problem with KVM_GET_EMULATED_CPUID being poorly defined.
>
> > Users can easily confuse the EMUL_F for something that sets a feature bit in
> > the KVM_GET_EMULATED_CPUID.
>
> I'll see if I can find a good spot for a comment to try and convenient
Best regards,
Maxim Levitsky
>