[PATCH v7 01/17] KVM: nSVM: Stop leaking single-stepping on VMRUN into L2

From: Yosry Ahmed

Date: Wed May 27 2026 - 19:47:41 EST


According to the APM, TF on VMRUN causes a #DB after VMRUN completes on
the _host_ side. However, KVM's kvm_skip_emulated_instruction() instead
injects the #DB in L2 context (or exits to userspace if
KVM_GUESTDBG_SINGLESTEP is set).

Avoid single-step handling on VMRUN by open-coding the rest of
kvm_skip_emulated_instruction() in nested_svm_vmrun(). This isn't
pretty, but subsequent changes will need to open-code
kvm_pmu_instruction_retired() anyway, and will clean up the code. KVM
now ignores TF on VMRUN instead of injecting a spurious exception into
L2. Document this virtualization hole with a FIXME.

Note that a failed VMRUN would have been correctly single-stepped, but
TF is now always ignored for consistency and simplicity. VMX does not
support TF on a successful VMLAUNCH/VMRESUME, so it's unlikely that
single-stepping VMRUN properly is important, especially if it's only
for failed VMRUNs.

Fixes: c8e16b78c614 ("x86: KVM: svm: eliminate hardcoded RIP advancement from vmrun_interception()")
Signed-off-by: Yosry Ahmed <yosry@xxxxxxxxxx>
---
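For reviewers, the behavioral change can be sketched with a standalone
toy model. This is only an illustration under stated assumptions: the
toy_* names, the struct layout and the db_pending flag are hypothetical
stand-ins and are not KVM code. It contrasts a generic skip helper that
performs single-step handling with the open-coded sequence this patch
uses for VMRUN (advance RIP, count the retired instruction, leave TF
alone).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: none of these types or helpers are the real KVM ones. */
#define TOY_RFLAGS_TF (1u << 8)	/* TF is bit 8 of RFLAGS */

struct toy_vcpu {
	uint64_t rip;
	uint64_t rflags;
	unsigned int insn_len;	/* length of the instruction being skipped */
	unsigned int retired;	/* stand-in for the PMU retired-insn count */
	bool db_pending;	/* stand-in for "a #DB was queued" */
};

/* Stand-in for the vendor skip callback: only advances RIP. */
static int toy_arch_skip(struct toy_vcpu *v)
{
	v->rip += v->insn_len;
	return 1;
}

/* Generic path: skip, count the instruction, then honor TF. */
static int toy_generic_skip(struct toy_vcpu *v)
{
	uint64_t rflags = v->rflags;

	if (!toy_arch_skip(v))
		return 0;

	v->retired++;
	if (rflags & TOY_RFLAGS_TF)
		v->db_pending = true;
	return 1;
}

/* Open-coded VMRUN path: skip and count, but deliberately ignore TF. */
static int toy_vmrun_skip(struct toy_vcpu *v)
{
	if (!toy_arch_skip(v))
		return 0;

	v->retired++;
	return 1;
}

/* Runs both paths on identical state and checks how they differ. */
int toy_demo(void)
{
	/* insn_len = 3 because VMRUN encodes as 3 bytes (0F 01 D8). */
	struct toy_vcpu a = { .rip = 0x1000, .rflags = TOY_RFLAGS_TF,
			      .insn_len = 3 };
	struct toy_vcpu b = a;

	assert(toy_generic_skip(&a) == 1);
	assert(toy_vmrun_skip(&b) == 1);

	/* Both paths advance RIP and count one retired instruction. */
	assert(a.rip == 0x1003 && b.rip == 0x1003);
	assert(a.retired == 1 && b.retired == 1);

	/* Only the generic path queues a #DB for TF. */
	assert(a.db_pending && !b.db_pending);
	return 0;
}
```

In the toy model the two paths differ only in whether TF produces a
pending #DB, which mirrors the intent described in the changelog.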
arch/x86/kvm/svm/nested.c | 18 +++++++++++++++---
arch/x86/kvm/svm/svm.c | 2 +-
arch/x86/kvm/svm/svm.h | 2 ++
3 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 28ac5d5c990dd..01e3e6fa8bbb1 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -30,6 +30,7 @@
#include "lapic.h"
#include "svm.h"
#include "hyperv.h"
+#include "pmu.h"

#define CC KVM_NESTED_VMENTER_CONSISTENCY_CHECK

@@ -1145,11 +1146,22 @@ int nested_svm_vmrun(struct kvm_vcpu *vcpu)
return kvm_handle_memory_failure(vcpu, X86EMUL_IO_NEEDED, NULL);

/* Advance RIP past VMRUN as part of the nested #VMEXIT. */
- return kvm_skip_emulated_instruction(vcpu);
+ if (!svm_skip_emulated_instruction(vcpu))
+ return 0;
+
+ kvm_pmu_instruction_retired(vcpu);
+ return 1;
}

- /* At this point, VMRUN is guaranteed to not fault; advance RIP. */
- ret = kvm_skip_emulated_instruction(vcpu);
+ /*
+ * At this point, VMRUN is guaranteed to not fault; advance RIP.
+ *
+ * FIXME: If TF is set on VMRUN, KVM should inject a #DB (or handle
+ * guest debugging) right after #VMEXIT; right now TF is just ignored.
+ */
+ ret = svm_skip_emulated_instruction(vcpu);
+ if (ret)
+ kvm_pmu_instruction_retired(vcpu);

/*
* Since vmcb01 is not in use, we can use it to store some of the L1
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index e74fcde6155ec..183e577802301 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -333,7 +333,7 @@ static int __svm_skip_emulated_instruction(struct kvm_vcpu *vcpu,
return 1;
}

-static int svm_skip_emulated_instruction(struct kvm_vcpu *vcpu)
+int svm_skip_emulated_instruction(struct kvm_vcpu *vcpu)
{
return __svm_skip_emulated_instruction(vcpu, EMULTYPE_SKIP, true);
}
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 2b6733dffd76f..e5d9984ef6320 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -832,6 +832,8 @@ static inline void svm_enable_intercept_for_msr(struct kvm_vcpu *vcpu,
svm_set_intercept_for_msr(vcpu, msr, type, true);
}

+int svm_skip_emulated_instruction(struct kvm_vcpu *vcpu);
+
/* nested.c */

#define NESTED_EXIT_HOST 0 /* Exit handled on host level */
--
2.54.0.794.g4f17f83d09-goog