[PATCH 1/3] KVM: nVMX: unwind PDPTR load if processor triggers a nested VMFail

From: Paolo Bonzini

Date: Thu Jun 04 2026 - 12:19:12 EST


Upon a VM-entry failure that is caught by the processor rather than
KVM, nested_vmx_restore_host_state() restores L1's CR3 but not the
PDPTRs. If shadow paging is used (enable_ept is false), the L2
PDPTRs loaded during the aborted entry attempt remain in
vcpu->arch.mmu->pdptrs[].

Storing the PDPTRs in the MMU does not help here: when enable_ept is
false KVM uses only root_mmu, so L1 and L2 share the same pdptrs[].

To fix this, use nested_vmx_load_cr3() instead of open coding
just the load of vcpu->arch.cr3, in the same way as
load_vmcs12_host_state(). nested_vmx_load_cr3() will mark CR3
as dirty rather than available, but this is only a very minor
pessimization.

If EPT *is* in use, do not load the PDPTRs and rely solely on
ept_save_pdptrs() to reload them from VMCS01. When vmx_load_mmu_pgd()
runs on the next entry, the PDPTRs are available, so they are not
incorrectly reloaded from memory.

kvm_mmu_unload() is kept to preserve the code paths of the old
kvm_mmu_reset_context() call, but it is actually unnecessary and
can be removed in a separate patch.

Cc: stable@xxxxxxxxxxxxxxx
Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
---
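Note on "dirty rather than available" in the commit message: the
following is a minimal standalone sketch, not KVM code, of why the
difference is only a small pessimization. All toy_* names are
hypothetical and exist only for illustration; the sketch assumes a
register cache with one "available" and one "dirty" bitmap, where
marking a register dirty also marks it available.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register indices, named only to echo the patch. */
enum toy_reg { TOY_REG_CR3, TOY_REG_PDPTR, TOY_NR_REGS };

/* Toy stand-in for a vCPU register cache (an assumption, not KVM's layout). */
struct toy_regcache {
	uint32_t avail;	/* bit set: the cached value is valid */
	uint32_t dirty;	/* bit set: the cached value must be written back */
};

static void toy_mark_available(struct toy_regcache *c, enum toy_reg reg)
{
	assert(reg < TOY_NR_REGS);
	c->avail |= UINT32_C(1) << reg;
}

/* Dirty implies available: the cache holds the authoritative value. */
static void toy_mark_dirty(struct toy_regcache *c, enum toy_reg reg)
{
	assert(reg < TOY_NR_REGS);
	c->avail |= UINT32_C(1) << reg;
	c->dirty |= UINT32_C(1) << reg;
}

static bool toy_is_available(const struct toy_regcache *c, enum toy_reg reg)
{
	return (c->avail & (UINT32_C(1) << reg)) != 0;
}

static bool toy_is_dirty(const struct toy_regcache *c, enum toy_reg reg)
{
	return (c->dirty & (UINT32_C(1) << reg)) != 0;
}

/* Models the next entry: count the pending write-backs, then clear them. */
static unsigned int toy_flush_dirty(struct toy_regcache *c)
{
	unsigned int reg, writes = 0;

	for (reg = 0; reg < TOY_NR_REGS; reg++)
		if (c->dirty & (UINT32_C(1) << reg))
			writes++;
	c->dirty = 0;
	return writes;
}
```

In this model, a CR3 marked dirty is just as usable by readers as one
marked available; the only extra cost is a single redundant write-back
the next time the dirty set is flushed.
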
arch/x86/kvm/vmx/nested.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 4690a4d23709..d612a5d071fc 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4947,6 +4947,7 @@ static inline u64 nested_vmx_get_vmcs01_guest_efer(struct vcpu_vmx *vmx)

static void nested_vmx_restore_host_state(struct kvm_vcpu *vcpu)
{
+ enum vm_entry_failure_code ignored;
struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
struct vcpu_vmx *vmx = to_vmx(vcpu);
struct vmx_msr_entry g, h;
@@ -4984,20 +4985,19 @@ static void nested_vmx_restore_host_state(struct kvm_vcpu *vcpu)
vmx_set_cr4(vcpu, vmcs_readl(CR4_READ_SHADOW));

nested_ept_uninit_mmu_context(vcpu);
- vcpu->arch.cr3 = vmcs_readl(GUEST_CR3);
- kvm_register_mark_available(vcpu, VCPU_REG_CR3);

/*
- * Use ept_save_pdptrs(vcpu) to load the MMU's cached PDPTRs
- * from vmcs01 (if necessary). The PDPTRs are not loaded on
- * VMFail, like everything else we just need to ensure our
- * software model is up-to-date.
+ * Now that nested EPT has been disabled, load the MMU's CR3 and
+ * possibly PDPTRs from vmcs01 (if necessary). This should not
+ * happen for VMFail, but we get here if the check was caught by
+ * the processor and therefore the guest CR3 was loaded prematurely.
*/
+ kvm_mmu_unload(vcpu);
+ if (nested_vmx_load_cr3(vcpu, vmcs_readl(GUEST_CR3), false, !enable_ept, &ignored))
+ nested_vmx_abort(vcpu, VMX_ABORT_LOAD_HOST_PDPTE_FAIL);
if (enable_ept && is_pae_paging(vcpu))
ept_save_pdptrs(vcpu);

- kvm_mmu_reset_context(vcpu);
-
/*
* This nasty bit of open coding is a compromise between blindly
* loading L1's MSRs using the exit load lists (incorrect emulation
--
2.52.0