Re: [PATCH 0/2] KVM: nVMX: vmcs.SYSENTER optimization and "fix"

From: Sean Christopherson
Date: Wed Apr 29 2020 - 11:11:12 EST


On Tue, Apr 28, 2020 at 04:45:25PM -0700, Jim Mattson wrote:
> On Tue, Apr 28, 2020 at 4:10 PM Sean Christopherson
> <sean.j.christopherson@xxxxxxxxx> wrote:
> >
> > Patch 1 is a "fix" for handling SYSENTER_EIP/ESP in L2 on a 32-bit vCPU.
> > The primary motivation is to provide consistent behavior after patch 2.
> >
> > Patch 2 is essentially a re-submission of a nested VMX optimization to
> > avoid redundant VMREADs to the SYSENTER fields in the nested VM-Exit path.
> >
> > After patch 2 and without patch 1, KVM would end up with weird behavior
> > where L1 and L2 would only see 32-bit values for their own SYSENTER_E*P
> > MSRs, but L1 could see a 64-bit value for L2's MSRs.
> >
> > Sean Christopherson (2):
> >   KVM: nVMX: Truncate writes to vmcs.SYSENTER_EIP/ESP for 32-bit vCPU
> >   KVM: nVMX: Drop superfluous VMREAD of vmcs02.GUEST_SYSENTER_*
> >
> >  arch/x86/kvm/vmx/nested.c |  4 ----
> >  arch/x86/kvm/vmx/vmx.c    | 18 ++++++++++++++++--
> >  2 files changed, 16 insertions(+), 6 deletions(-)
>
> It seems like this could be fixed more generally by truncating
> natural-width fields on 32-bit vCPUs in handle_vmwrite(). However,
> that also would imply that we can't shadow any natural-width fields on
> a 32-bit vCPU.

handle_vmwrite() and handle_vmread() already correctly handle truncating
writes/reads when L1 isn't in 64-bit mode.
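
As a rough illustration of that mode-based truncation, here is a standalone
sketch. It is not the KVM code; model_natural_width_operand() and its
parameters are made-up names, and it only models the register operand of a
natural-width VMWRITE/VMREAD being limited to 32 bits when L1 is not in
64-bit mode.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a KVM function): the value a VMWRITE/VMREAD of
 * a natural-width field operates on, given whether L1 is in 64-bit mode.
 */
static uint64_t model_natural_width_operand(uint64_t reg, bool l1_in_64bit_mode)
{
	/* Outside 64-bit mode only the low 32 bits of the operand are used. */
	return l1_in_64bit_mode ? reg : (uint32_t)reg;
}
```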

This path is effectively out-of-band, for lack of a better phrase. The
WRMSR is intercepted and the data is stuffed into vmcs02. Without these
patches, the effective L2 state depends on the underlying hardware
capabilities, e.g. L2 gets 64-bit behavior if L0 is a 64-bit CPU, and
32-bit behavior if L0 is a 32-bit CPU. It's "wrong", but consistent as the
value seen by L2 is the same value that is saved into vmcs12. Of course, in
the 64-bit CPU case, L1 can't actually see the full value via VMREAD as the
vCPU is 32-bit, but at least the underlying memory/machinery is consistent.

With just patch 2, the above would still be true for 64-bit L0, but for
32-bit L0 it would result in L2 seeing a 32-bit value while saving a 64-bit
value into vmcs12. Again, L1 wouldn't see the 64-bit value when using
VMREAD, but the value in memory is still wrong-ish.

Truncating the value on WRMSR interception makes the behavior fully
dependent on the vCPU capabilities, i.e. what L2 sees is the same value
that's saved into vmcs12, which is the same value seen by VMREAD in L1,
irrespective of whether L0 is 64-bit or 32-bit.
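
To make the cases above concrete, here is a small standalone model of an
intercepted WRMSR to SYSENTER_EIP/ESP. It is a sketch of the reasoning in
this mail, not the actual KVM implementation: struct sysenter_view,
model_wrmsr_sysenter() and the patch1/patch2 flags are names invented for
illustration, and "32-bit L0" is modeled simply as hardware keeping only
the low 32 bits of the natural-width field.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical summary of who observes which value; not a KVM struct. */
struct sysenter_view {
	uint64_t seen_by_l2;   /* value the hardware field holds for L2 */
	uint64_t saved_vmcs12; /* value that ends up in vmcs12 memory */
	uint64_t l1_vmread;    /* value L1 observes via VMREAD */
};

static struct sysenter_view model_wrmsr_sysenter(uint64_t data,
						 bool l0_is_64bit,
						 bool vcpu_is_64bit,
						 bool patch1, bool patch2)
{
	struct sysenter_view v;

	/* Patch 1: truncate on WRMSR interception based on the vCPU. */
	if (patch1 && !vcpu_is_64bit)
		data = (uint32_t)data;

	/* The natural-width hardware field is only 32 bits on a 32-bit L0. */
	v.seen_by_l2 = l0_is_64bit ? data : (uint32_t)data;

	/*
	 * Without patch 2 the field is read back from hardware on nested
	 * VM-Exit; with patch 2 the value stuffed at WRMSR time is kept.
	 */
	v.saved_vmcs12 = patch2 ? data : v.seen_by_l2;

	/* VMREAD from a 32-bit vCPU only returns the low 32 bits. */
	v.l1_vmread = vcpu_is_64bit ? v.saved_vmcs12
				    : (uint32_t)v.saved_vmcs12;
	return v;
}
```

For a 32-bit vCPU, calling this with only patch2 set and l0_is_64bit false
gives seen_by_l2 != saved_vmcs12, which is the "wrong-ish" case; with both
flags set all three values match regardless of l0_is_64bit.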