Re: [PATCH v2 46/49] KVM: x86: Replace (almost) all guest CPUID feature queries with cpu_caps

From: Maxim Levitsky
Date: Wed Jul 24 2024 - 14:02:21 EST


On Tue, 2024-07-09 at 12:20 -0700, Sean Christopherson wrote:
> On Thu, Jul 04, 2024, Maxim Levitsky wrote:
> > On Fri, 2024-05-17 at 10:39 -0700, Sean Christopherson wrote:
> > > +static __always_inline bool guest_cpuid_has(struct kvm_vcpu *vcpu,
> > > + unsigned int x86_feature)
> > > {
> > > const struct cpuid_reg cpuid = x86_feature_cpuid(x86_feature);
> > > struct kvm_cpuid_entry2 *entry;
> > > + u32 *reg;
> > > +
> > > + /*
> > > + * XSAVES is a special snowflake. Due to lack of a dedicated intercept
> > > + * on SVM, KVM must assume that XSAVES (and thus XRSTORS) is usable by
> > > + * the guest if the host supports XSAVES and *XSAVE* is exposed to the
> > > + * guest. Although the guest can read/write XSS via XSAVES/XRSTORS, to
> > > + * minimize the virtualization hole, KVM rejects attempts to read/write
> > > + * XSS via RDMSR/WRMSR. To make that work, KVM needs to check the raw
> > > + * guest CPUID, not KVM's view of guest capabilities.
> >
> > Hi,
> >
> > I think that this comment is wrong:
> >
> > The guest can't read/write XSS via XSAVES/XRSTORS. It can only use XSAVES/XRSTORS
> > to save/restore features that are enabled in XSS, and thus if there are none enabled,
> > XSAVES/XRSTORS act more or less as XSAVEOPTC/XRSTOR (except working only when CPL=0).
>
> Doh, right you are.
>
> > So I don't think that there is a virtualization hole except the fact that VMM can't
> > really disable XSAVES if it chooses to.
>
> There is still a hole. If XSAVES is not supported, KVM runs the guest with the
> host XSS. See the conditional switching in kvm_load_{guest,host}_xsave_state().
> Not treating XSAVES as being available to the guest would allow the guest to read
> and write host supervisor state.

Makes sense. The remaining virtualization hole is indeed that we can't disable XSAVES
for the guest, even if userspace chooses to do so.
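
To spell out the conditional switching, here is a toy model. It is only a sketch
and not the code in kvm_load_{guest,host}_xsave_state(): every struct, field and
function name below is made up for illustration, and the only assumption taken from
the discussion above is that XSS is switched to the guest value only when XSAVES
is treated as usable by the guest.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: all names here are hypothetical, not KVM's. */
struct toy_vcpu {
	bool xsaves_enabled;	/* XSAVES is treated as usable by the guest */
	uint64_t guest_xss;	/* the guest's value of IA32_XSS */
};

struct toy_cpu {
	uint64_t host_xss;	/* the host's value of IA32_XSS */
	uint64_t hw_xss;	/* value currently loaded in the modeled MSR */
};

/* Before entering the guest: switch XSS only if XSAVES is considered usable. */
static void toy_load_guest_xss(struct toy_cpu *cpu, const struct toy_vcpu *vcpu)
{
	if (vcpu->xsaves_enabled && vcpu->guest_xss != cpu->host_xss)
		cpu->hw_xss = vcpu->guest_xss;
}

/* After leaving the guest: restore the host value under the same condition. */
static void toy_load_host_xss(struct toy_cpu *cpu, const struct toy_vcpu *vcpu)
{
	if (vcpu->xsaves_enabled && vcpu->guest_xss != cpu->host_xss)
		cpu->hw_xss = cpu->host_xss;
}

/*
 * Without an intercept, a guest XSAVES/XRSTORS operates on whatever
 * supervisor components the currently loaded XSS enables.
 */
static uint64_t toy_guest_visible_xss(const struct toy_cpu *cpu)
{
	return cpu->hw_xss;
}
```

In this model, with xsaves_enabled false, hw_xss stays at host_xss while the guest
runs, so an unintercepted XSAVES in the guest would operate on host supervisor state.
With xsaves_enabled true, the guest only ever sees its own XSS value.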


>
> I'll rewrite the comment to call that out.
>
> > Another "half virtualization hole" is that since we have chosen to not
> > intercept XSAVES at all, (AMD can't do this at all, and it's slow anyway) we
> > instead opted to never support some XSS bits (so far all of them, only
> > upcoming CET will add a few supported bits).
> >
> > This creates an unexpected situation for the guest - an enabled feature (e.g. PT)
> > but no XSS bit supported to context switch it. x86 arch does allow this
> > though.


Best regards,
Maxim Levitsky