Re: [PATCH v6 16/31] KVM: nSVM: Unify handling of VMRUN failures with proper cleanup

From: Yosry Ahmed

Date: Fri Feb 27 2026 - 19:47:12 EST

On Fri, Feb 27, 2026 at 4:07 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> On Tue, Feb 24, 2026, Yosry Ahmed wrote:
> > There are currently two possible causes of VMRUN failures emulated by
> > KVM:
> >
> > 1) Consistency checks failures. In this case, KVM updates the exit code
> > in the mapped VMCB12 and exits early in nested_svm_vmrun(). This
> > causes a few problems:
> >
> > A) KVM does not clear the GIF if the early consistency checks fail
> > (because nested_svm_vmexit() is not called). Nothing requires
> > GIF=0 before a VMRUN, from the APM:
> >
> > It is assumed that VMM software cleared GIF some time before
> > executing the VMRUN instruction, to ensure an atomic state
> > switch.
> >
> > So an early #VMEXIT from early consistency checks could leave the
> > GIF set.
> >
> > B) svm_leave_smm() is missing consistency checks on the newly loaded
> > guest state, because the checks aren't performed by
> > enter_svm_guest_mode().
>
> This is flat out wrong. RSM isn't missing any consistency checks that are
> provided by nested_vmcb_check_save().
>
> if (CC(!(save->efer & EFER_SVME))) <=== irrelevant given KVM's implementation
> return false;
>
> if (CC((save->cr0 & X86_CR0_CD) == 0 && (save->cr0 & X86_CR0_NW)) || <== kvm_set_cr0() in rsm_enter_protected_mode()
> CC(save->cr0 & ~0xffffffffULL))
> return false;
>
> if (CC(!kvm_dr6_valid(save->dr6)) || CC(!kvm_dr7_valid(save->dr7))) <== kvm_set_dr() in rsm_load_state_{32,64}
> return false;
>
> /*
> * These checks are also performed by KVM_SET_SREGS,
> * except that EFER.LMA is not checked by SVM against
> * CR0.PG && EFER.LME.
> */
> if ((save->efer & EFER_LME) && (save->cr0 & X86_CR0_PG)) {
> if (CC(!(save->cr4 & X86_CR4_PAE)) || <== kvm_set_cr4() in rsm_enter_protected_mode()
> CC(!(save->cr0 & X86_CR0_PE)) || <== kvm_set_cr0() in rsm_enter_protected_mode()
> CC(!kvm_vcpu_is_legal_cr3(vcpu, save->cr3))) <== kvm_set_cr3() in rsm_enter_protected_mode()
> return false;
> }
>
> /* Note, SVM doesn't have any additional restrictions on CR4. */
> if (CC(!__kvm_is_valid_cr4(vcpu, save->cr4))) <== kvm_set_cr4() in rsm_enter_protected_mode()
> return false;
>
> if (CC(!kvm_valid_efer(vcpu, save->efer))) <== __kvm_emulate_msr_write() in rsm_load_state_64()
> return false;
>
> Even if RSM were missing checks on the L2 state being loaded, I'm not willing to
> take on _any_ complexity in nested VMRUN to make RSM suck a little less. KVM's
> L2 => SMM => RSM => L2 is fundamentally broken. Anyone that argues otherwise is
> ignoring architecturally defined behavior in the SDM and APM.
>
> If someone wants to actually put in the effort to properly emulating SMI => RSM
> from L2, then I'd be happy to take on some complexity, but even then it's not at
> all clear that it would be necessary.
>
> > 2) Failure to load L2's CR3 or merge the MSR bitmaps. In this case, a
> > fully-fledged #VMEXIT injection is performed as VMCB02 is already
> > prepared.
> >
> > Arguably all VMRUN failures should be handled before the VMCB02 is
> > prepared, but with proper cleanup (e.g. clear the GIF).
>
> Eh, so long as KVM cleans up after itself, I don't see anything wrong with
> preparing some of vmcb02.
>
> So after staring at this for some time, us having gone through multiple attempts
> to get things right, and this being tagged for stable@, unless I'm missing some
> massive simplification this provides down the road, I am strongly against refactoring
> this code, and 100% against reworking things to "fix" SMM.

For context, this patch (and others you quoted below) were a direct
result of this discussion in v2:
https://lore.kernel.org/kvm/aThIQzni6fC1qdgj@xxxxxxxxxx/. I didn't
look too closely into the SMM bug tbh I just copy/pasted that verbatim
into the changelog.

As for refactoring the code, I didn't really do it for SMM, but I
think the code is generally cleaner with the single VMRUN failure
path. That being said..

> And so for the stable@ patches, I'm also opposed to all of these:
>
> KVM: nSVM: Refactor minimal #VMEXIT handling out of nested_svm_vmexit()
> KVM: nSVM: Call nested_svm_init_mmu_context() before switching to VMCB02
> KVM: nSVM: Call nested_svm_merge_msrpm() from enter_svm_guest_mode()
> KVM: nSVM: Make nested_svm_merge_msrpm() return an errno
> KVM: nSVM: Call enter_guest_mode() before switching to VMCB02
> KVM: nSVM: Drop nested_vmcb_check_{save/control}() wrappers
>
> unless they're *needed* by some later commit (I didn't look super closely).
>
> For stable@, just fix the GIF case and move on.

.. I am not sure if you mean dropping them completely, or dropping
them from stable@.

I am fine with dropping the stable@ tag from everything from this
point onward, or re-ordering the patches to keep it for the missing
consistency checks.

If you mean drop them completely, it's a bit of a shame because I
think the code ends up looking much better, but I also understand
given all the back-and-forth, and the new problem I reported recently
that will need further refactoring to address (see my other reply to
the same patch).

Let me know how you want to proceed: drop the patches entirely and
just fix GIF, or fix GIF first for stable, and keep the refactoring
for non-stable.

>
> diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> index d734cd5eef5e..d9790e37d4e8 100644
> --- a/arch/x86/kvm/svm/nested.c
> +++ b/arch/x86/kvm/svm/nested.c
> @@ -1036,6 +1036,7 @@ int nested_svm_vmrun(struct kvm_vcpu *vcpu)
> vmcb12->control.exit_code = SVM_EXIT_ERR;
> vmcb12->control.exit_info_1 = 0;
> vmcb12->control.exit_info_2 = 0;
> + svm_set_gif(svm, false);
> goto out;
> }
>
> Sorry for not catching this earlier, I didn't actually read the changelog until
> this version. /facepalm