Re: [PATCH v2 10/13] KVM: nSVM: Restrict mapping VMCB12 on nested VMRUN
From: Sean Christopherson
Date: Fri Dec 12 2025 - 18:30:54 EST
On Thu, Dec 11, 2025, Yosry Ahmed wrote:
> On Wed, Dec 10, 2025 at 11:05:44PM +0000, Yosry Ahmed wrote:
> > On Tue, Dec 09, 2025 at 08:03:15AM -0800, Sean Christopherson wrote:
> > > On Mon, Nov 10, 2025, Yosry Ahmed wrote:
> > Unfortunately this doesn't work, it breaks the newly introduced
> > nested_invalid_cr3_test. The problem is that we bail before we fully
> > initialize VMCB02, then nested_svm_vmrun() calls nested_svm_vmexit(),
> > which restores state from VMCB02 to VMCB12.
> >
> > The test first tries to run L2 with a messed up CR3, which fails but
> > corrupts VMCB12 due to the above, then the second nested entry is
> > screwed.
> >
> > There are two fixes, the easy one is to just move the consistency checks
> > after nested_vmcb02_prepare_control() and nested_vmcb02_prepare_save()
> > (like the existing failure mode of nested_svm_load_cr3()). This works,
> > but the code doesn't make a lot of sense because we use VMCB12 to create
> > VMCB02 and THEN check that VMCB12 is valid.
> >
> > The alternative is unfortunately a lot more involved. We only do a
> > partial restore or a "fast #VMEXIT" for failed VMRUNs. We'd need to:
> >
> > 1) Move nested_svm_load_cr3() above nested_vmcb02_prepare_control(),
> > which needs moving nested_svm_init_mmu_context() out of
> > nested_vmcb02_prepare_control() to remain before
> > nested_svm_load_cr3().
> >
> > This makes sure a failed nested VMRUN always needs a "fast #VMEXIT".
> >
> > 2) Figure out which parts of nested_svm_vmexit() are needed in the
> > failed VMRUN case. We need to at least switch the VMCB, propagate the
> > error code, and do some cleanups. We can split this out into the
> > "fast #VMEXIT" path, and use it for failed VMRUNs.
> >
> > Let me know which way you prefer.
>
> I think I prefer (2), the code looks cleaner and I like having a
> separate code path for VMRUN failures. Unless there are objections, I
> will do that in the next version.
With the caveat that I haven't seen the code, that has my vote too. nVMX has a
similar flow, and logically this is equivalent, at least to me. We can probably
even use similar terminology, e.g. vmrun_fail_vmexit instead of vmentry_fail_vmexit.

vmentry_fail_vmexit:
	vmx_switch_vmcs(vcpu, &vmx->vmcs01);

	if (!from_vmentry)
		return NVMX_VMENTRY_VMEXIT;

	load_vmcs12_host_state(vcpu, vmcs12);
	vmcs12->vm_exit_reason = exit_reason.full;
	if (enable_shadow_vmcs || nested_vmx_is_evmptr12_valid(vmx))
		vmx->nested.need_vmcs12_to_shadow_sync = true;
	return NVMX_VMENTRY_VMEXIT;
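
Purely for illustration, here is a user-space toy model of how an analogous
SVM flow might be shaped. This is not KVM code: every toy_/TOY_ name is
hypothetical, the structs are stand-ins for the real VMCB and vCPU state, and
the CR3 rule is invented just to have something to fail on. The only point is
the ordering: validate VMCB12 before touching VMCB02, and on failure take a
small dedicated exit path that switches back to VMCB01 and reports the error
without copying anything from VMCB02 into VMCB12.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy value mirroring the "invalid VMRUN" exit code (-1); assumption for the sketch. */
#define TOY_SVM_EXIT_ERR	((uint64_t)-1)

/* Toy stand-in for a VMCB; the real structure has far more state. */
struct toy_vmcb {
	uint64_t cr3;
	uint64_t exit_code;
};

/* Toy stand-in for the per-vCPU nested state. */
struct toy_vcpu {
	struct toy_vmcb vmcb01;		/* L1's own state */
	struct toy_vmcb vmcb02;		/* state built by the host to run L2 */
	struct toy_vmcb vmcb12;		/* L1's VMCB describing L2 */
	struct toy_vmcb *current;	/* VMCB that would be used on the next run */
	bool guest_mode;
};

/* Hypothetical consistency check: pretend CR3 bits 63:52 must be zero. */
static bool toy_cr3_is_valid(uint64_t cr3)
{
	return !(cr3 >> 52);
}

/*
 * Hypothetical "fast #VMEXIT" for a failed VMRUN: switch back to vmcb01 and
 * propagate the error code.  Nothing is restored from vmcb02, so the rest of
 * vmcb12 is left exactly as L1 wrote it.
 */
static void toy_vmrun_fail_vmexit(struct toy_vcpu *vcpu)
{
	vcpu->current = &vcpu->vmcb01;
	vcpu->guest_mode = false;
	vcpu->vmcb12.exit_code = TOY_SVM_EXIT_ERR;
}

/* Returns 0 if L2 would be entered, -1 if the VMRUN failed its checks. */
static int toy_nested_vmrun(struct toy_vcpu *vcpu)
{
	/* Check vmcb12 first, before any of it is used to build vmcb02. */
	if (!toy_cr3_is_valid(vcpu->vmcb12.cr3)) {
		toy_vmrun_fail_vmexit(vcpu);
		return -1;
	}

	/* Only now derive vmcb02 from vmcb12 and switch to it. */
	vcpu->vmcb02.cr3 = vcpu->vmcb12.cr3;
	vcpu->current = &vcpu->vmcb02;
	vcpu->guest_mode = true;
	return 0;
}
```

In this model, a VMRUN with a bad CR3 followed by a VMRUN with a good one
behaves the way nested_invalid_cr3_test expects: the first attempt only sets
the exit code in VMCB12, and the second attempt enters L2 normally.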