Re: [PATCH 3/3] KVM: x86: Check for injected exceptions before queuing a debug exception
From: Yosry Ahmed
Date: Fri Feb 27 2026 - 13:34:39 EST
> > That being said, I hate nested_run_in_progress. It's too close to
> > nested_run_pending and I am pretty sure they will be mixed up.
>
> Agreed, though the fact that the name is _too_ close means that, aside from the
> potential for disaster (minor detail), it's accurate.
>
> One thought is to hide nested_run_in_progress behind a Kconfig, so that attempts
> to use it for anything but the sanity check(s) would fail the build. I don't
> really want to create yet another KVM_PROVE_xxx though, but unlike KVM_PROVE_MMU,
> I think we want this enabled in production.
>
> I'll chew on this a bit...

Maybe (if we go in this direction) name it very explicitly, e.g.
warn_on_nested_exception, if it's only intended to be used for the
sanity checks?

>
> > exception_from_userspace's name made me think this is something we
> > could key off to WARN, but it's meant to morph queued exceptions from
> > userspace into an "exception_vmexit" if needed. The field name is
> > generic but its functionality isn't, maybe it should have been called
> > exception_check_vmexit or something. Anyway..
>
> No? It's not a "check", it's literally a pending exception that has been morphed
> to a VM-Exit.

I meant that the exception_from_userspace flag means "KVM should check
if this exception should be morphed to a VM-Exit". It doesn't mean
that the exception has already morphed or will necessarily morph. So
exception_check_vmexit makes sense to me: if it's set, KVM checks
whether the exception needs to be morphed to a VM-Exit.

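To spell out what I mean by "check, not already morphed", here is a
standalone userspace model. It is not the actual x86.c code, and every
sketch_* name (including the intercept bitmap) is made up for
illustration. The flag only asks for a check, and the check may well
leave the exception pending as-is:

```c
#include <stdbool.h>

/* Toy model only; none of these types exist in KVM under these names. */
struct sketch_exception {
	bool pending;
	unsigned int vector;
};

struct sketch_vcpu {
	bool guest_mode;                /* models "L2 is active" */
	bool exception_from_userspace;  /* "please check this exception" */
	struct sketch_exception exception;
	struct sketch_exception exception_vmexit;
	unsigned int l1_intercepts;     /* hypothetical: bit N set => L1 intercepts vector N */
};

/* Hypothetical stand-in for asking the nested code whether L1 wants this vector. */
static bool sketch_is_exception_vmexit(const struct sketch_vcpu *vcpu,
				       unsigned int vector)
{
	return vector < 32 && (vcpu->l1_intercepts & (1u << vector));
}

/*
 * The flag being set does not mean the exception has morphed.  It only
 * means this check runs; the exception morphs iff L2 is active and L1
 * intercepts the vector.  The flag is consumed either way.
 */
static void sketch_check_userspace_exception(struct sketch_vcpu *vcpu)
{
	if (vcpu->exception_from_userspace && vcpu->guest_mode &&
	    vcpu->exception.pending &&
	    sketch_is_exception_vmexit(vcpu, vcpu->exception.vector)) {
		vcpu->exception_vmexit = vcpu->exception;
		vcpu->exception.pending = false;
	}

	vcpu->exception_from_userspace = false;
}
```

With the flag set but no intercept, the exception stays a plain pending
exception, which is why "check" reads more naturally to me than
"from_userspace".
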
> > That gave me an idea though, can we add a field to
> > kvm_queued_exception to identify the origin of the exception
> > (userspace vs. KVM)? Then we can key the warning off of that.
>
> That would incur non-trivial maintenance costs, and it would be tricky to get the
> broader protection of the existing WARNing "right". E.g. how would KVM know that
> the VM-Exit was originally induced by an exception that was queued by userspace?

KVM should have the info when morphing a pending exception to a
VM-Exit, assuming whoever is queuing the exception is passing it in,
but yeah, I see how this can be a burden.
>
> > We can potentially also avoid adding the field and just plumb the
> > argument through to kvm_multiple_exception(), and WARN there if
> > nested_run_pending is set and the origin is not userspace?
>
> Not really, because kvm_vcpu_ioctl_x86_set_vcpu_events() doesn't use
> kvm_queue_exception(), it stuffs things directly.

Right, what I had in mind was that by default exceptions are assumed
to be queued by KVM, so kvm_vcpu_ioctl_x86_set_vcpu_events() doesn't
need to change. Basically, if a code path is queuing an exception from
userspace, it should use a new variant of kvm_queue_exception() (e.g.
kvm_queue_exception_u()). If it's stuffing things directly, nothing to
do. I think the code paths queuing exceptions from userspace should be
limited, so it should be fine to do this.

That being said, this still WARNs on the queuing side, not on the
checking side, so if you think that's not the right thing to do in
general then scratch this too.
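
As a standalone illustration of the "KVM by default, explicit variant
for userspace" idea, here is a userspace toy model. It is not kernel
code; the toy_* names and the warnings counter are made up, and the
counter just stands in for what WARN_ON_ONCE() would flag:

```c
#include <stdbool.h>
#include <stdio.h>

/* Toy stand-in for the relevant vCPU state; NOT the real struct kvm_vcpu. */
struct toy_vcpu {
	bool nested_run_pending;    /* models vcpu->arch.nested_run_pending */
	bool exception_pending;
	unsigned int exception_nr;
	unsigned int warnings;      /* how many times the WARN condition fired */
};

/* Common queuing path: complain if KVM itself queues during a pending nested run. */
static void toy_multiple_exception(struct toy_vcpu *vcpu, unsigned int nr,
				   bool from_userspace)
{
	if (!from_userspace && vcpu->nested_run_pending) {
		vcpu->warnings++;
		fprintf(stderr, "WARN: vector %u queued by KVM with nested run pending\n",
			nr);
	}

	/* The warning does not stop anything; the exception is queued regardless. */
	vcpu->exception_pending = true;
	vcpu->exception_nr = nr;
}

/* Default variant: the exception is assumed to come from KVM. */
static void toy_queue_exception(struct toy_vcpu *vcpu, unsigned int nr)
{
	toy_multiple_exception(vcpu, nr, false);
}

/* Explicit variant for the (few) paths that queue on behalf of userspace. */
static void toy_queue_exception_u(struct toy_vcpu *vcpu, unsigned int nr)
{
	toy_multiple_exception(vcpu, nr, true);
}
```

Only callers that know they are acting for userspace opt in to the _u
variant; everything else, including paths that stuff state directly,
stays untouched.
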
>
> That said, if you want to try and code it up, I say go for it. Worst case scenario
> you'll have wasted a bit of time.

I meant something like this (completely untested). It only WARNs when
KVM tries to queue the exception; it doesn't actually stop it:

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 0c8aacf1fa67f..6f4148eae08be 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -836,11 +836,14 @@ static void kvm_queue_exception_vmexit(struct kvm_vcpu *vcpu, unsigned int vecto
static void kvm_multiple_exception(struct kvm_vcpu *vcpu, unsigned int nr,
bool has_error, u32 error_code,
- bool has_payload, unsigned long payload)
+ bool has_payload, unsigned long payload,
+ bool from_userspace)
{
u32 prev_nr;
int class1, class2;
+ WARN_ON_ONCE(!from_userspace && vcpu->arch.nested_run_pending);
+
kvm_make_request(KVM_REQ_EVENT, vcpu);
/*
@@ -899,7 +902,7 @@ static void kvm_multiple_exception(struct kvm_vcpu *vcpu, unsigned int nr,
void kvm_queue_exception(struct kvm_vcpu *vcpu, unsigned nr)
{
- kvm_multiple_exception(vcpu, nr, false, 0, false, 0);
+ kvm_multiple_exception(vcpu, nr, false, 0, false, 0, false);
}
EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_queue_exception);
@@ -907,14 +910,19 @@ EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_queue_exception);
void kvm_queue_exception_p(struct kvm_vcpu *vcpu, unsigned nr,
unsigned long payload)
{
- kvm_multiple_exception(vcpu, nr, false, 0, true, payload);
+ kvm_multiple_exception(vcpu, nr, false, 0, true, payload, false);
}
EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_queue_exception_p);
static void kvm_queue_exception_e_p(struct kvm_vcpu *vcpu, unsigned nr,
u32 error_code, unsigned long payload)
{
- kvm_multiple_exception(vcpu, nr, true, error_code, true, payload);
+ kvm_multiple_exception(vcpu, nr, true, error_code, true, payload, false);
+}
+
+static void kvm_queue_exception_u(struct kvm_vcpu *vcpu, unsigned nr)
+{
+ kvm_multiple_exception(vcpu, nr, false, 0, false, 0, true);
}
void kvm_requeue_exception(struct kvm_vcpu *vcpu, unsigned int nr,
@@ -1015,7 +1023,7 @@ void kvm_inject_nmi(struct kvm_vcpu *vcpu)
void kvm_queue_exception_e(struct kvm_vcpu *vcpu, unsigned nr, u32 error_code)
{
- kvm_multiple_exception(vcpu, nr, true, error_code, false, 0);
+ kvm_multiple_exception(vcpu, nr, true, error_code, false, 0, false);
}
EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_queue_exception_e);
@@ -5519,7 +5527,7 @@ static int kvm_vcpu_ioctl_x86_set_mce(struct kvm_vcpu *vcpu,
banks[3] = mce->misc;
vcpu->arch.mcg_status = mce->mcg_status;
banks[1] = mce->status;
- kvm_queue_exception(vcpu, MC_VECTOR);
+ kvm_queue_exception_u(vcpu, MC_VECTOR);
} else if (!(banks[1] & MCI_STATUS_VAL)
|| !(banks[1] & MCI_STATUS_UC)) {
if (banks[1] & MCI_STATUS_VAL)
@@ -12546,9 +12554,9 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
if (kvm_is_exception_pending(vcpu) ||
vcpu->arch.exception.injected)
goto out;
if (dbg->control & KVM_GUESTDBG_INJECT_DB)
- kvm_queue_exception(vcpu, DB_VECTOR);
+ kvm_queue_exception_u(vcpu, DB_VECTOR);
else
- kvm_queue_exception(vcpu, BP_VECTOR);
+ kvm_queue_exception_u(vcpu, BP_VECTOR);
}
/*