Re: [RFC PATCH] KVM: x86: Disallow KVM_SET_CPUID{2} if the vCPU is in guest mode

From: Sean Christopherson
Date: Wed Dec 18 2019 - 16:24:45 EST


On Wed, Dec 18, 2019 at 12:57:41PM -0800, Jim Mattson wrote:
> On Wed, Dec 18, 2019 at 12:10 PM Sean Christopherson
> <sean.j.christopherson@xxxxxxxxx> wrote:
> >
> > On Wed, Dec 18, 2019 at 11:38:43AM -0800, Jim Mattson wrote:
> > > On Wed, Dec 18, 2019 at 9:42 AM Sean Christopherson
> > > <sean.j.christopherson@xxxxxxxxx> wrote:
> > > >
> > > > Reject KVM_SET_CPUID{2} with -EBUSY if the vCPU is in guest mode (L2) to
> > > > avoid complications and potentially undesirable KVM behavior. Allowing
> > > > userspace to change a guest's capabilities while L2 is active would at
> > > > best result in unexpected behavior in the guest (L1 or L2), and at worst
> > > > induce bad KVM behavior by breaking fundamental assumptions regarding
> > > > transitions between L0, L1 and L2.
> > >
> > > This seems a bit contrived. As long as we're breaking the ABI, can we
> > > disallow changes to CPUID once the vCPU has been powered on?
> >
> > I can at least concoct scenarios where changing CPUID after KVM_RUN
> > provides value, e.g. effectively creating a new VM/vCPU without destroying
> > the kernel's underlying data structures and without putting the file
> > descriptors, for performance (especially if KVM avoids its hardware on/off
> > paths) or sandboxing (process has access to a VM fd, but not /dev/kvm).
> >
> > A truly contrived, but technically architecturally accurate, scenario would
> > be modeling SGX interaction with the machine check architecture. Per the
> > SDM, #MCs or clearing bits in IA32_MCi_CTL disable SGX, which is reflected
> > in CPUID:
> >
> > Any machine check exception (#MC) that occurs after Intel SGX is first
> > enabled causes Intel SGX to be disabled (CPUID.SGX_Leaf.0:EAX[SGX1] == 0).
> > It cannot be enabled until after the next reset.
> >
> > Any act of clearing bits from '1 to '0 in any of the IA32_MCi_CTL registers
> > may disable Intel SGX (set CPUID.SGX_Leaf.0:EAX[SGX1] to 0) until the next
> > reset.
> >
> > I doubt a userspace VMM would actively model that behavior, but it's at
> > least theoretically possible. Yes, it would technically be possible for
> > SGX to be disabled while L2 is active, but I don't think it's unreasonable
> > to require userspace to first force the vCPU out of L2.
>
> It's perfectly reasonable for a machine check to be handled by L2, in
> which case, it would be rather onerous to require userspace to force
> the vCPU out of L2 to clear CPUID.SGX_Leaf.0:EAX[SGX1].

Hrm. I just had to go and think of SGX... I guess it's probably best to
suck it up and have CET update the right bitmaps.
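
For reference, the check the RFC proposed boils down to something like the
sketch below. This is a standalone userspace model, not the actual patch:
struct model_vcpu, model_set_cpuid() and the in_guest_mode flag are
hypothetical stand-ins (in KVM the equivalent test would be
is_guest_mode(vcpu)), and the real ioctl path does far more than store a
count.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the vCPU state relevant to the check. */
struct model_vcpu {
	bool in_guest_mode;   /* true while L2 is active */
	uint32_t cpuid_nent;  /* number of CPUID entries currently installed */
};

/*
 * Model of the RFC's proposed behavior: refuse to update CPUID while the
 * vCPU is running a nested (L2) guest, otherwise accept the update.
 */
static int model_set_cpuid(struct model_vcpu *vcpu, uint32_t nent)
{
	if (vcpu->in_guest_mode)
		return -EBUSY;

	vcpu->cpuid_nent = nent;
	return 0;
}
```

Under this model, Jim's machine check case is exactly the one that gets
rejected: userspace would see -EBUSY and have to force the vCPU out of L2
before it could clear CPUID.SGX_Leaf.0:EAX[SGX1].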