Re: [PATCH v2] KVM: x86: Add dedicated helper to get CPUID entry with significant index

From: Sean Christopherson
Date: Tue Jul 12 2022 - 12:36:45 EST


On Tue, Jul 12, 2022, Maxim Levitsky wrote:
> On Tue, 2022-07-12 at 00:06 +0000, Sean Christopherson wrote:
> >  static inline struct kvm_cpuid_entry2 *cpuid_entry2_find(
> > -       struct kvm_cpuid_entry2 *entries, int nent, u32 function, u32 index)
> > +       struct kvm_cpuid_entry2 *entries, int nent, u32 function, u64 index)
> How I wish that this would be just called EAX and ECX... Anyway....

Heh, I strongly disagree. EAX and ECX are how the CPUID instruction specifies
the function and index, CPUID the lookup itself operates on function+index,
e.g. there are plenty of situations where KVM queries CPUID info without the
inputs coming from EAX/ECX.

> >  {
> >         struct kvm_cpuid_entry2 *e;
> >         int i;
> > @@ -77,9 +85,22 @@ static inline struct kvm_cpuid_entry2 *cpuid_entry2_find(
> >         for (i = 0; i < nent; i++) {
> >                 e = &entries[i];
> >  
> > -               if (e->function == function &&
> > -                   (!(e->flags & KVM_CPUID_FLAG_SIGNIFCANT_INDEX) || e->index == index))
> > +               if (e->function != function)
> > +                       continue;
> > +
> > +               /*
> > +                * If the index isn't significant, use the first entry with a
> > +                * matching function.  It's userspace's responsibilty to not
> > +                * provide "duplicate" entries in all cases.
> > +                */
> > +               if (!(e->flags & KVM_CPUID_FLAG_SIGNIFCANT_INDEX) || e->index == index)
> >                         return e;
> > +
> > +               /*
> > +                * Function matches and index is significant; not specifying an
> > +                * exact index in this case is a KVM bug.
> > +                */
> Nitpick: Why KVM bug? Bad userspace can also provide a index-significant entry for cpuid
> leaf for which index is not significant in the x86 spec.

Ugh, you're right.

> We could arrange a table of all known leaves and for each leaf if it has an index
> in the x86 spec, and warn/reject the userspace CPUID info if it doesn't match.

We have such a table, cpuid_function_is_indexed(). The alternative would be to
do:

WARN_ON_ONCE(index == KVM_CPUID_INDEX_NOT_SIGNIFICANT &&
cpuid_function_is_indexed(function));

The problem with rejecting userspace CPUID on mismatch is that it could break
userspace :-/ Of course, this entire patch would also break userspace to some
extent, e.g. if userspace is relying on an exact match on index==0. The only
difference being the guest lookups with an exact index would still work.

I think the restriction we could put in place would be that userspace can make
a leaf more relaxed, e.g. to play nice if userspace forgets to set the SIGNFICANT
flag, but rejects attempts to make guest CPUID more restrictive, i.e. disallow
setting the SIGNFICANT flag on leafs that KVM doesn't enumerate as significant.

> > +               WARN_ON_ONCE(index == KVM_CPUID_INDEX_NOT_SIGNIFICANT);
> >         }
> >  
> >         return NULL;
> > @@ -96,7 +117,8 @@ static int kvm_check_cpuid(struct kvm_vcpu *vcpu,
> >          * The existing code assumes virtual address is 48-bit or 57-bit in the
> >          * canonical address checks; exit if it is ever changed.
> >          */
> > -       best = cpuid_entry2_find(entries, nent, 0x80000008, 0);
> > +       best = cpuid_entry2_find(entries, nent, 0x80000008,
> > +                                KVM_CPUID_INDEX_NOT_SIGNIFICANT);
> OK.

Thanks for looking through all these!

> >  static struct kvm_cpuid_entry2 *kvm_find_kvm_cpuid_features(struct kvm_vcpu *vcpu)
> > @@ -219,7 +242,7 @@ static void __kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu, struct kvm_cpuid_e
> >         struct kvm_cpuid_entry2 *best;
> >         u64 guest_supported_xcr0 = cpuid_get_supported_xcr0(entries, nent);
> >  
> > -       best = cpuid_entry2_find(entries, nent, 1, 0);
> > +       best = cpuid_entry2_find(entries, nent, 1, KVM_CPUID_INDEX_NOT_SIGNIFICANT);
>
> Leaf 1, no index indeed.
>
> Offtopic: I wonder why we call this 'best'?

Awful, awful historic code. IIRC, for functions whose index is not significant,
KVM would iterate over all entries and look for an exact function+index match
anyways. If there was at least one partial match (function match only) but no
full match, KVM would use the first partial match, which it called the "best" match.

We've been slowly/opportunistically killing off the "best" terminology.

> > -struct kvm_cpuid_entry2 *kvm_find_cpuid_entry(struct kvm_vcpu *vcpu,
> > -                                             u32 function, u32 index)
> > +struct kvm_cpuid_entry2 *kvm_find_cpuid_entry_index(struct kvm_vcpu *vcpu,
> > +                                                   u32 function, u32 index)
> Nitpick: could you fix the indention while at it?

The indentation is correct, it's only the diff that appears misaligned.

> > @@ -1353,11 +1384,11 @@ get_out_of_range_cpuid_entry(struct kvm_vcpu *vcpu, u32 *fn_ptr, u32 index)
> >                 return NULL;
> >  
> >         if (function >= 0x40000000 && function <= 0x4fffffff)
> > -               class = kvm_find_cpuid_entry(vcpu, function & 0xffffff00, 0);
> > +               class = kvm_find_cpuid_entry(vcpu, function & 0xffffff00);
> >         else if (function >= 0xc0000000)
> > -               class = kvm_find_cpuid_entry(vcpu, 0xc0000000, 0);
> > +               class = kvm_find_cpuid_entry(vcpu, 0xc0000000);
> >         else
> > -               class = kvm_find_cpuid_entry(vcpu, function & 0x80000000, 0);
> > +               class = kvm_find_cpuid_entry(vcpu, function & 0x80000000);
> This assumes that all the classes has first entry whose EAX specifies max leaf
> for this class. True for sure for basic and extended features, don't know
> if true for hypervisor and Centaur entries. Seems OK.

It holds true for all known hypervisors. There's no formal definition for using
0x400000yy as the hypervisor range, but the de facto standard is to use EBX, ECX,
and EDX for the signature, and EAX for the max leaf.

The Centaur behavior is very much a guess, but odds are it's a correct guess. When
I added the Centaur code, I spent far too much time trying (and failing) to hunt
down documentation.