Re: [PATCH v2 22/49] KVM: x86: Add a macro to precisely handle aliased 0x1.EDX CPUID features

From: Sean Christopherson
Date: Thu Jul 25 2024 - 14:40:12 EST


On Wed, Jul 24, 2024, Maxim Levitsky wrote:
> On Mon, 2024-07-08 at 14:08 -0700, Sean Christopherson wrote:
> > On Thu, Jul 04, 2024, Maxim Levitsky wrote:
> > > On Fri, 2024-05-17 at 10:38 -0700, Sean Christopherson wrote:
> > > > Add a macro to precisely handle CPUID features that AMD duplicated from
> > > > CPUID.0x1.EDX into CPUID.0x8000_0001.EDX. This will allow adding an
> > > > assert that all features passed to kvm_cpu_cap_init() match the word being
> > > > processed, e.g. to prevent passing a feature from CPUID 0x7 to CPUID 0x1.
> > > >
> > > > Because the kernel simply reuses the X86_FEATURE_* definitions from
> > > > CPUID.0x1.EDX, KVM's use of the aliased features would result in false
> > > > positives from such an assert.
> > > >
> > > > No functional change intended.
> > > >
> > > > Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> > > > ---
> > > > arch/x86/kvm/cpuid.c | 24 +++++++++++++++++-------
> > > > 1 file changed, 17 insertions(+), 7 deletions(-)
> > > >
> > > > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> > > > index 5e3b97d06374..f2bd2f5c4ea3 100644
> > > > --- a/arch/x86/kvm/cpuid.c
> > > > +++ b/arch/x86/kvm/cpuid.c
> > > > @@ -88,6 +88,16 @@ u32 xstate_required_size(u64 xstate_bv, bool compacted)
> > > > F(name); \
> > > > })
> > > >
> > > > +/*
> > > > + * Aliased Features - For features in 0x8000_0001.EDX that are duplicates of
> > > > + * identical 0x1.EDX features, and thus are aliased from 0x1 to 0x8000_0001.
> > > > + */
> > > > +#define AF(name) \
> > > > +({ \
> > > > + BUILD_BUG_ON(__feature_leaf(X86_FEATURE_##name) != CPUID_1_EDX); \
> > > > + feature_bit(name); \
> > > > +})
> > > > +
> > > > /*
> > > > * Magic value used by KVM when querying userspace-provided CPUID entries and
> > > > * doesn't care about the CPIUD index because the index of the function in
> > > > @@ -758,13 +768,13 @@ void kvm_set_cpu_caps(void)
> > > > );
> > > >
> > > > kvm_cpu_cap_init(CPUID_8000_0001_EDX,
> > > > - F(FPU) | F(VME) | F(DE) | F(PSE) |
> > > > - F(TSC) | F(MSR) | F(PAE) | F(MCE) |
> > > > - F(CX8) | F(APIC) | 0 /* Reserved */ | F(SYSCALL) |
> > > > - F(MTRR) | F(PGE) | F(MCA) | F(CMOV) |
> > > > - F(PAT) | F(PSE36) | 0 /* Reserved */ |
> > > > - F(NX) | 0 /* Reserved */ | F(MMXEXT) | F(MMX) |
> > > > - F(FXSR) | F(FXSR_OPT) | X86_64_F(GBPAGES) | F(RDTSCP) |
> > > > + AF(FPU) | AF(VME) | AF(DE) | AF(PSE) |
> > > > + AF(TSC) | AF(MSR) | AF(PAE) | AF(MCE) |
> > > > + AF(CX8) | AF(APIC) | 0 /* Reserved */ | F(SYSCALL) |
> > > > + AF(MTRR) | AF(PGE) | AF(MCA) | AF(CMOV) |
> > > > + AF(PAT) | AF(PSE36) | 0 /* Reserved */ |
> > > > + F(NX) | 0 /* Reserved */ | F(MMXEXT) | AF(MMX) |
> > > > + AF(FXSR) | F(FXSR_OPT) | X86_64_F(GBPAGES) | F(RDTSCP) |
> > > > 0 /* Reserved */ | X86_64_F(LM) | F(3DNOWEXT) | F(3DNOW)
> > > > );
> > > >
> > >
> > > Hi,
> > >
> > > What if we defined the aliased features instead.
> > > Something like this:
> > >
> > > #define __X86_FEATURE_8000_0001_ALIAS(feature) \
> > > (feature + (CPUID_8000_0001_EDX - CPUID_1_EDX) * 32)
> > >
> > > #define KVM_X86_FEATURE_FPU_ALIAS __X86_FEATURE_8000_0001_ALIAS(KVM_X86_FEATURE_FPU)
> > > #define KVM_X86_FEATURE_VME_ALIAS __X86_FEATURE_8000_0001_ALIAS(KVM_X86_FEATURE_VME)
> > >
> > > And then just use for example the 'F(FPU_ALIAS)' in the CPUID_8000_0001_EDX
> >
> > At first glance, I really liked this idea, but after working through the
> > ramifications, I think I prefer "converting" the flag when passing it to
> > kvm_cpu_cap_init(). In-place conversion makes it all but impossible for KVM to
> > check the alias, e.g. via guest_cpu_cap_has(), especially since the AF() macro
> > doesn't set the bits in kvm_known_cpu_caps (if/when a non-hacky validation of
> > usage becomes reality).
>
> Could you elaborate on this as well?
>
> My suggestion was that we can just treat aliases as completely independent
> and dummy features, say KVM_X86_FEATURE_FPU_ALIAS, and pass them as is to the
> guest, which means that if an alias is present in host cpuid, it appears in
> kvm caps, and thus qemu can then set it in guest cpuid.
>
> I don't think that we need any special treatment for them if you look at it
> this way. If you don't agree, can you give me an example?

KVM doesn't honor the aliases beyond telling userspace they can be set (see below
for all the aliased features that KVM _should_ be checking). The APM clearly
states that the features are the same as their CPUID.0x1 counterparts, but Intel
CPUs don't support the aliases. So, as you also note below, I think we could
unequivocally say that enumerating the aliases but not the "real" features is a
bogus CPUID model, but we can't say the opposite, i.e. the real features can
exists without the aliases.

And that means that KVM must never query the aliases, e.g. should never do
guest_cpu_cap_has(KVM_X86_FEATURE_FPU_ALIAS), because the result is essentially
meaningless. It's a small thing, but if KVM_X86_FEATURE_FPU_ALIAS simply doesn't
exist, i.e. we do in-place conversion, then it's impossible to feed the aliases
into things like guest_cpu_cap_has().

Heh, on a related topic, __cr4_reserved_bits() fails to account for any of the
aliased features. Unless I'm missing something, VME, DE, TSC, PSE, PAE, PGE and
MCE, all need to be handled in __cr4_reserved_bits(). Amusingly,
nested_vmx_cr_fixed1_bits_update() handles the aliased legacy features. I don't
see any reason for nested_vmx_cr_fixed1_bits_update() to manually query guest
CPUID, it should be able to use cr4_guest_rsvd_bits verbatim.

> > Side topic, if it's not already documented somewhere else, kvm/x86/cpuid.rst
> > should call out that KVM only honors the features in CPUID.0x1, i.e. that setting
> > aliased bits in CPUID.0x8000_0001 is supported if and only if the bit(s) is also
> > set in CPUID.0x1.
>
> To be honest if KVM enforces this, such enforcement can be removed IMHO:

There's no enforcement, and as above I agree that this would be a bogus CPUID
model. I was thinking that it could be helpful to document that KVM never checks
the aliases, but on second though, it's probably unnecessary because the APM does
say

Same as CPUID Fn0000_0001_EDX[...]

for all the bits, i.e. setting the aliases without the real bits is an
architectural violation.