[PATCH v3 0/7] KVM: few more SMM fixes

From: Maxim Levitsky
Date: Mon Sep 13 2021 - 10:13:58 EST


These are a few SMM fixes I was working on last week.

* Patches 1 and 2 fix a minor issue that remained after
commit 37be407b2ce8 ("KVM: nSVM: Fix L1 state corruption upon return from SMM")

While returns to guest mode from SMM now work thanks to the state restored from
the HSAVE area, the guest entry still sees incorrect HSAVE state.

For example, this breaks return from SMM when the guest is 32 bit, because the
PDPTRs are loaded using incorrect MMU state, which in turn was set up with
incorrect L1 HSAVE state.

V3: updated with review feedback from Sean.

* Patch 3 fixes a theoretical issue that I introduced with my SREGS2 patchset,
which Sean Christopherson pointed out.

The issue is that the KVM_REQ_GET_NESTED_STATE_PAGES request is not only used
for completing the load of the nested state, but is also used to complete the
exit from SMM to guest mode, and my compatibility hack of pdptrs_from_userspace
was written assuming that the request is never used for the latter.

V3: I moved the reset of pdptrs_from_userspace to common x86 code.
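
The interaction described for patch 3 can be illustrated with a small
standalone model. This is only a sketch and not the kernel code: struct
toy_vcpu and all toy_* helpers are hypothetical names made up for this
example, and the request is modeled as a plain boolean. It assumes that the
PDPTR reload done while servicing the request is skipped whenever
pdptrs_from_userspace is set.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a vCPU; NOT the kernel's struct kvm_vcpu. */
struct toy_vcpu {
	bool in_smm;
	bool pdptrs_from_userspace; /* userspace loaded PDPTRs out of band */
	bool req_nested_pages;      /* models a pending KVM_REQ_GET_NESTED_STATE_PAGES */
	int pdptr_reloads;          /* times PDPTRs were re-read from guest memory */
};

/* Hypothetical: userspace restores state, including PDPTRs. */
static void toy_userspace_set_pdptrs(struct toy_vcpu *v)
{
	v->pdptrs_from_userspace = true;
}

/* Hypothetical common-code SMM exit; reset_flag selects old vs. new behaviour. */
static void toy_leave_smm(struct toy_vcpu *v, bool reset_flag)
{
	assert(v->in_smm);
	v->in_smm = false;
	if (reset_flag)
		v->pdptrs_from_userspace = false;
	/* The SMM exit path reuses the same request as the nested state load. */
	v->req_nested_pages = true;
}

/* Models servicing the request before the next guest entry. */
static void toy_service_requests(struct toy_vcpu *v)
{
	if (!v->req_nested_pages)
		return;
	v->req_nested_pages = false;
	/* The reload is skipped when userspace claims to have set the PDPTRs. */
	if (!v->pdptrs_from_userspace)
		v->pdptr_reloads++;
}

/* Runs the whole scenario and returns how many PDPTR reloads happened. */
static int toy_reloads_after_smm_exit(bool reset_flag)
{
	struct toy_vcpu v = { .in_smm = true };

	toy_userspace_set_pdptrs(&v);
	toy_leave_smm(&v, reset_flag);
	toy_service_requests(&v);
	return v.pdptr_reloads;
}
```

In this model, toy_reloads_after_smm_exit(false) skips the reload because the
stale flag is still set, while toy_reloads_after_smm_exit(true) performs it.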

* Patch 4 makes SVM SMM exit a bit more similar to how VMX does it
by also raising KVM_REQ_GET_NESTED_STATE_PAGES requests.

I do have doubts about why we need to do this on VMX though. The initial
justification for this comes from

7f7f1ba33cf2 ("KVM: x86: do not load vmcs12 pages while still in SMM")

With all the MMU changes, I am not sure that we can still have a case
where the MMU is not up to date when we enter the nested guest from SMM.
On SVM it does seem to work anyway without this.

* Patch 5 fixes a guest emulation failure when unrestricted_guest=0 and we reach
handle_exception_nmi_irqoff.
That function takes stale values from the current VMCS and fails, because it
does not take into account that we are now emulating invalid guest state and
thus no VM exit happened.
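
The stale-value problem behind patch 5 can be shown with a toy model. This is
a sketch under assumptions, not the actual vmx.c change: struct toy_vmx, the
TOY_* constants and the toy_* functions are hypothetical, and the cached exit
information is reduced to two fields. The idea shown is that when no VM entry
is attempted, the cached exit information is overwritten with a synthesized
"invalid state" exit so that later handlers do not act on the previous exit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy exit reasons; the real encodings live in the VMX headers. */
#define TOY_EXIT_EXCEPTION_NMI 0u
#define TOY_EXIT_INVALID_STATE 33u

/* Toy stand-in for the cached exit information; NOT struct vcpu_vmx. */
struct toy_vmx {
	bool emulation_required;
	uint32_t exit_reason;
	uint32_t exit_intr_info;
	int irqoff_exceptions_handled;
};

/* Models one run step; synthesize=true mimics the idea of the fix. */
static void toy_vcpu_run(struct toy_vmx *vmx, bool synthesize)
{
	assert(vmx);
	if (vmx->emulation_required) {
		/* No VM entry happens here, so there is no fresh exit info. */
		if (synthesize) {
			vmx->exit_reason = TOY_EXIT_INVALID_STATE;
			vmx->exit_intr_info = 0;
		}
		return;
	}
	/* A real VM exit caused by an exception (arbitrary example value). */
	vmx->exit_reason = TOY_EXIT_EXCEPTION_NMI;
	vmx->exit_intr_info = 0x80000b0eu;
}

/* Models the irqoff handler acting on whatever exit info is cached. */
static void toy_handle_exit_irqoff(struct toy_vmx *vmx)
{
	if (vmx->exit_reason == TOY_EXIT_EXCEPTION_NMI)
		vmx->irqoff_exceptions_handled++;
}

/* Returns how many times the handler ran on stale exit information. */
static int toy_stale_handlings(bool synthesize)
{
	struct toy_vmx vmx = { 0 };

	/* First run: a real exit, handled once as expected. */
	toy_vcpu_run(&vmx, synthesize);
	toy_handle_exit_irqoff(&vmx);

	/* Guest state becomes invalid; the next run emulates instead. */
	vmx.emulation_required = true;
	toy_vcpu_run(&vmx, synthesize);
	toy_handle_exit_irqoff(&vmx);

	return vmx.irqoff_exceptions_handled - 1;
}
```

Without the synthesized exit the handler runs a second time on the leftover
exception exit; with it, the second pass sees the invalid-state reason and
does nothing.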

* Patch 6 fixes a corner case where return from SMM slightly corrupts
the L2 segment register state when unrestricted_guest=0, due to the real mode
segment register caching logic, although the state is later restored correctly
from SMRAM.
Fix this by not failing nested_vmx_enter_non_root_mode and delaying this
failure to the next nested VM entry.
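
The "delay the failure" approach of patch 6 can be sketched as follows. This
is a simplified model, not the nVMX code: struct toy_l2_state and the toy_*
functions are hypothetical, guest state validity is reduced to one boolean,
and the delay_check parameter exists only to compare the old and new
behaviour side by side.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_entry_status { TOY_ENTRY_OK, TOY_ENTRY_FAIL };

/* Toy stand-in for the L2 guest state; only segment validity is modeled. */
struct toy_l2_state {
	bool segments_valid;
};

/*
 * Models entering non-root mode. from_vmentry is true for a real nested
 * VM entry and false for the special entries (return from SMM, nested
 * state load). With delay_check the validity check only applies to real
 * VM entries.
 */
static enum toy_entry_status
toy_enter_non_root_mode(const struct toy_l2_state *l2, bool from_vmentry,
			bool delay_check)
{
	bool check_now = delay_check ? from_vmentry : true;

	assert(l2);
	if (check_now && !l2->segments_valid)
		return TOY_ENTRY_FAIL;
	return TOY_ENTRY_OK;
}

/* Scenario: return from SMM with transiently invalid L2 segments. */
static bool toy_smm_return_succeeds(bool delay_check)
{
	struct toy_l2_state l2 = { .segments_valid = false };

	/* Special entry during SMM exit: segments hold transient values. */
	if (toy_enter_non_root_mode(&l2, false, delay_check) != TOY_ENTRY_OK)
		return false;

	/* The segment registers are then restored correctly from SMRAM. */
	l2.segments_valid = true;

	/* The next real nested VM entry performs the check. */
	return toy_enter_non_root_mode(&l2, true, delay_check) == TOY_ENTRY_OK;
}
```

Here toy_smm_return_succeeds(false) fails on the transient state, while
toy_smm_return_succeeds(true) lets the restore complete first. State that is
still invalid at the next real VM entry is rejected in both variants.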

* Patch 7 fixes another corner case where emulation_required was not updated
correctly on nested VM exit when restoring the L1 segment registers.

I still track 2 SMM issues:

1. When a Hyper-V guest is running nested and uses SMM-enabled OVMF, it crashes
and reboots during the boot process.

2. Nested migration on VMX is still broken when L1 floods itself with SMIs.

Best regards,
Maxim Levitsky

Maxim Levitsky (7):
  KVM: x86: nSVM: refactor svm_leave_smm and smm_enter_smm
  KVM: x86: nSVM: restore the L1 host state prior to resuming nested
    guest on SMM exit
  KVM: x86: reset pdptrs_from_userspace when exiting smm
  KVM: x86: SVM: call KVM_REQ_GET_NESTED_STATE_PAGES on exit from SMM
    mode
  KVM: x86: VMX: synthesize invalid VM exit when emulating invalid guest
    state
  KVM: x86: nVMX: don't fail nested VM entry on invalid guest state if
    !from_vmentry
  KVM: x86: nVMX: re-evaluate emulation_required on nested VM exit

arch/x86/kvm/svm/nested.c | 9 ++-
arch/x86/kvm/svm/svm.c | 131 ++++++++++++++++++++------------------
arch/x86/kvm/svm/svm.h | 3 +-
arch/x86/kvm/vmx/nested.c | 9 ++-
arch/x86/kvm/vmx/vmx.c | 28 ++++++--
arch/x86/kvm/vmx/vmx.h | 1 +
arch/x86/kvm/x86.c | 7 ++
7 files changed, 113 insertions(+), 75 deletions(-)

--
2.26.3