Re: [PATCH v1 2/4] KVM: nSVM: Delay stuffing L2's current RIP into NextRIP until vCPU run
From: Sean Christopherson
Date: Tue Feb 24 2026 - 19:56:29 EST

On Tue, Feb 24, 2026, Yosry Ahmed wrote:
> > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> > index 8f8bc863e2143..e084b9688f556 100644
> > --- a/arch/x86/kvm/svm/svm.c
> > +++ b/arch/x86/kvm/svm/svm.c
> > @@ -1413,6 +1413,24 @@ static void svm_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
> >  		sd->bp_spec_reduce_set = true;
> >  		msr_set_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
> >  	}
> > +
> > +	/*
> > +	 * If nrips is supported in hardware but not exposed to L1, stuff the
> > +	 * actual L2 RIP to emulate what a nrips=0 CPU would do (L1 is
> > +	 * responsible for advancing RIP prior to injecting the event). Once L2
> > +	 * runs after L1 executes VMRUN, NextRIP is updated by the CPU and/or
> > +	 * KVM, and this is no longer needed.
> > +	 *
> > +	 * This is done here (as opposed to when preparing vmcb02) to use the
> > +	 * most up-to-date value of RIP regardless of the order of restoring
> > +	 * registers and nested state in the vCPU save+restore path.
> > +	 */
> > +	if (is_guest_mode(vcpu) && svm->nested.nested_run_pending) {
> > +		if (boot_cpu_has(X86_FEATURE_NRIPS) &&
> > +		    !guest_cpu_cap_has(vcpu, X86_FEATURE_NRIPS))
> > +			svm->vmcb->control.next_rip = kvm_rip_read(vcpu);
> > +	}
> > +
>
> Doing this in svm_prepare_switch_to_guest() is wrong, or at least doing it
> after the svm->guest_state_loaded check is. It's possible to emulate the
> nested VMRUN without doing a vcpu_put(), which means
> svm->guest_state_loaded will remain true and this code will be
> skipped.
>
> In fact, this breaks the svm_nested_soft_inject_test test. Funny
> enough, I was only running it with my repro changes, which papered
> over the bug because it forced an exit to userspace after VMRUN due to
> single-stepping, so svm->guest_state_loaded got cleared and the code
> was executed on the next KVM_RUN, before L2 runs.
>
> I can move it above the svm->guest_state_loaded check, but I think I
> will just put it in pre_svm_run() instead.

I would rather not expand pre_svm_run(), and instead just open code it in
svm_vcpu_run(). pre_svm_run() probably should never have been added, because
it's far from a generic "pre run" API. E.g. if we want to keep the helper around,
it should probably be named something something ASID.
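
For illustration only, a minimal userspace C model (not kernel code) of the
ordering problem described above. All toy_* names are made up for this sketch;
the fields merely mirror the flags referenced in the patch. It assumes, per the
quoted description, that emulating VMRUN without a vcpu_put() leaves
guest_state_loaded set, and it contrasts stuffing NextRIP behind that check
with open coding it in the run path.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins (hypothetical names) for the state referenced in the patch. */
struct toy_vcpu {
	bool guest_state_loaded;	/* models svm->guest_state_loaded */
	bool guest_mode;		/* models is_guest_mode(vcpu) */
	bool nested_run_pending;	/* models svm->nested.nested_run_pending */
	bool hw_nrips;			/* models boot_cpu_has(X86_FEATURE_NRIPS) */
	bool guest_nrips;		/* models guest_cpu_cap_has(vcpu, X86_FEATURE_NRIPS) */
	unsigned long rip;		/* models kvm_rip_read(vcpu) */
	unsigned long next_rip;		/* models svm->vmcb->control.next_rip */
};

/* The condition and assignment from the quoted hunk, applied to the toy state. */
static void toy_stuff_next_rip(struct toy_vcpu *v)
{
	if (v->guest_mode && v->nested_run_pending &&
	    v->hw_nrips && !v->guest_nrips)
		v->next_rip = v->rip;
}

/* Models the early return: nothing below the check runs if state is loaded. */
static void toy_prepare_switch_to_guest(struct toy_vcpu *v, bool stuff_here)
{
	if (v->guest_state_loaded)
		return;

	if (stuff_here)
		toy_stuff_next_rip(v);

	v->guest_state_loaded = true;
}

/* Assumption: nested VMRUN is emulated in-kernel, so guest_state_loaded is untouched. */
static void toy_emulate_vmrun(struct toy_vcpu *v, unsigned long l2_rip)
{
	assert(!v->guest_mode);		/* VMRUN is only emulated while running L1 */
	v->guest_mode = true;
	v->nested_run_pending = true;
	v->rip = l2_rip;
	v->next_rip = 0;
}

/* Returns the toy NextRIP seen on the first run after the emulated VMRUN. */
static unsigned long toy_next_rip_after_vmrun(bool open_coded)
{
	struct toy_vcpu v = {
		.guest_state_loaded = true,	/* L1 was running, no vcpu_put() since */
		.hw_nrips = true,
		.guest_nrips = false,
	};

	toy_emulate_vmrun(&v, 0x1000);

	/* v1 placement: stuffing sits behind the guest_state_loaded check. */
	toy_prepare_switch_to_guest(&v, !open_coded);

	/* Alternative placement: open coded in the run path, always reached. */
	if (open_coded)
		toy_stuff_next_rip(&v);

	return v.next_rip;
}
```

In this model, toy_next_rip_after_vmrun(false) leaves the toy NextRIP at 0
because the early return skips the stuffing, whereas
toy_next_rip_after_vmrun(true) yields the L2 RIP (0x1000).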