[PATCH v1 2/4] KVM: nSVM: Delay stuffing L2's current RIP into NextRIP until vCPU run

From: Yosry Ahmed

Date: Mon Feb 23 2026 - 10:51:49 EST


For guests with NRIPS disabled, L1 does not provide NextRIP when running
an L2 with an injected soft interrupt; instead, it advances L2's RIP
before running it. KVM uses L2's current RIP as the NextRIP in vmcb02 to
emulate a CPU without NRIPS.

However, in svm_set_nested_state(), the value used for L2's current RIP
comes from vmcb02, which holds whatever value the vCPU had before
restoring nested state (zero on a freshly created vCPU). Passing the
cached RIP value instead (i.e. kvm_rip_read()) would only fix the issue
if registers are restored before nested state.

Instead, split the logic for setting NextRIP in vmcb02. Handle the
'normal' case of initializing vmcb02's NextRIP from vmcb12's NextRIP
(or KVM_GET_NESTED_STATE's payload) in nested_vmcb02_prepare_control().
Delay the special case of stuffing L2's current RIP into vmcb02's
NextRIP until shortly before the vCPU is run, to make sure the most
up-to-date value of RIP is used regardless of the relative ordering of
KVM_SET_REGS and KVM_SET_NESTED_STATE.

Fixes: cc440cdad5b7 ("KVM: nSVM: implement KVM_GET_NESTED_STATE and KVM_SET_NESTED_STATE")
CC: stable@xxxxxxxxxxxxxxx
Suggested-by: Sean Christopherson <seanjc@xxxxxxxxxx>
Signed-off-by: Yosry Ahmed <yosry@xxxxxxxxxx>
---
arch/x86/kvm/svm/nested.c | 25 ++++++++-----------------
arch/x86/kvm/svm/svm.c | 18 ++++++++++++++++++
2 files changed, 26 insertions(+), 17 deletions(-)

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index a82e6f0472ca7..b7c80aeaebab3 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -844,24 +844,15 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm,
vmcb02->control.event_inj_err = svm->nested.ctl.event_inj_err;

/*
- * NextRIP is consumed on VMRUN as the return address pushed on the
- * stack for injected soft exceptions/interrupts. If nrips is exposed
- * to L1, take it verbatim from vmcb12.
- *
- * If nrips is supported in hardware but not exposed to L1, stuff the
- * actual L2 RIP to emulate what a nrips=0 CPU would do (L1 is
- * responsible for advancing RIP prior to injecting the event). This is
- * only the case for the first L2 run after VMRUN. After that (e.g.
- * during save/restore), NextRIP is updated by the CPU and/or KVM, and
- * the value of the L2 RIP from vmcb12 should not be used.
+ * If nrips is exposed to L1, take NextRIP as-is. Otherwise, L1
+ * advances L2's RIP before VMRUN instead of using NextRIP. KVM will
+ * stuff the current RIP as vmcb02's NextRIP before L2 is run. After
+ * the first run of L2 (e.g. after save+restore), NextRIP is updated by
+ * the CPU and/or KVM and should be used regardless of L1's support.
*/
- if (boot_cpu_has(X86_FEATURE_NRIPS)) {
- if (guest_cpu_cap_has(vcpu, X86_FEATURE_NRIPS) ||
- !svm->nested.nested_run_pending)
- vmcb02->control.next_rip = svm->nested.ctl.next_rip;
- else
- vmcb02->control.next_rip = vmcb12_rip;
- }
+ if (guest_cpu_cap_has(vcpu, X86_FEATURE_NRIPS) ||
+ !svm->nested.nested_run_pending)
+ vmcb02->control.next_rip = svm->nested.ctl.next_rip;

svm->nmi_l1_to_l2 = is_evtinj_nmi(vmcb02->control.event_inj);
if (is_evtinj_soft(vmcb02->control.event_inj)) {
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 8f8bc863e2143..e084b9688f556 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1413,6 +1413,24 @@ static void svm_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
sd->bp_spec_reduce_set = true;
msr_set_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
}
+
+ /*
+ * If nrips is supported in hardware but not exposed to L1, stuff the
+ * actual L2 RIP to emulate what a nrips=0 CPU would do (L1 is
+ * responsible for advancing RIP prior to injecting the event). Once L2
+ * runs after L1 executes VMRUN, NextRIP is updated by the CPU and/or
+ * KVM, and this is no longer needed.
+ *
+ * This is done here (as opposed to when preparing vmcb02) to use the
+ * most up-to-date value of RIP regardless of the order of restoring
+ * registers and nested state in the vCPU save+restore path.
+ */
+ if (is_guest_mode(vcpu) && svm->nested.nested_run_pending) {
+ if (boot_cpu_has(X86_FEATURE_NRIPS) &&
+ !guest_cpu_cap_has(vcpu, X86_FEATURE_NRIPS))
+ svm->vmcb->control.next_rip = kvm_rip_read(vcpu);
+ }
+
svm->guest_state_loaded = true;
}

--
2.53.0.345.g96ddfc5eaa-goog