Re: [PATCH v4 26/30] KVM: x86: Don't treat interrupts as allowed just because a nested run is pending

From: Yosry Ahmed

Date: Mon Jun 15 2026 - 12:43:23 EST


On Fri, Jun 12, 2026 at 5:04 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> When querying whether or not interrupts (IRQs) are allowed, check for a
> pending nested run _after_ checking whether or not interrupts are blocked.
> If L1 is running L2 _without_ nested_exit_on_intr(), i.e. if L1 IRQs can
> be blocked while running L2, and interrupts will indeed be blocked once the
> nested VM-Enter to L2 is completed, then KVM should treat interrupts as not
> being allowed.
>
> For injection, this avoids an unnecessary (forced) VM-Exit, as KVM can
> immediately request an IRQ window, instead of forcing an exit and _then_
> requesting an IRQ window (because after the forced exit, KVM will see that
> interrupts are blocked).
>
> For non-injection usage, only kvm_vcpu_ready_for_interrupt_injection() is
> affected in practice. kvm_vcpu_has_events() is unreachable when a nested
> run is pending, as KVM clears nested_run_pending prior to calling
> kvm_emulate_halt_noskip() when putting L2 into HLT via GUEST_ACTIVITY_HLT,
> and SVM has no equivalent to GUEST_ACTIVITY_STATE. I.e. the vCPU will
> always be runnable if a nested run is pending, and thus
> kvm_arch_vcpu_runnable() => kvm_vcpu_has_events() is effectively dead code,
> as is __kvm_emulate_halt() => kvm_vcpu_has_events(). Oh, and TDX doesn't
> support nested VMX. Similarly, kvm_can_do_async_pf() is unreachable as
> KVM shouldn't be faulting in memory with a pending nested VM-Enter.
>
> As for kvm_vcpu_ready_for_interrupt_injection(), incorrectly treating
> interrupts as being allowed could result in KVM prematurely exiting to
> userspace to accept an ExtINT.

"incorrectly treating interrupts as being allowed" is the status quo,
that this patch fixes, not sth this patch introduces -- right?

The changelog reads like for the non-injection case this change might
not be the right thing to do, but I don't think this is the case? I
assume returning false from
kvm_vcpu_ready_for_interrupt_injection() and kvm_vcpu_has_events() if
L1's interrupts are blocked while L2 is running is the right thing to
do?

The code makes sense to me but I am trying to make sense of the changelog.

Aside from that, I have two comments about existing issues (Sashiko-style):

1. The return values of {vmx/svm}_interrupt_allowed() are annoying.
IIUC, 0 is not allowed, 1 is allowed, and -EBUSY is generally allowed
but not right now, request immediate exit and try again? That should
be documented somewhere (maybe I just missed it?).

2. Aside from TDX, seems like {vmx/svm}_interrupt_allowed() are doing
the same thing? So maybe move all that logic into
kvm_arch_interrupt_allowed(), rename it (because it's only used by
x86), and make interrupt_blocked() the actual per-vendor callback? I
assume this is the case because nested_run_pending used to be
per-vendor, but now we can clean this up. For TDX, I assume the
per-vendor hook can just be tdx_interrupt_allowed() reversed?

> But, KVM will still hold/block the ExtINT
> and request its own IRQ window. I.e. the net effect is more or less the
> same as the for-injection case.
>
> Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> ---
> arch/x86/kvm/svm/svm.c | 6 +++---
> arch/x86/kvm/vmx/vmx.c | 5 ++++-
> 2 files changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 526e0fdcd16b..883d5f703391 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -4055,12 +4055,12 @@ static int svm_interrupt_allowed(struct kvm_vcpu *vcpu, bool for_injection)
> {
> struct vcpu_svm *svm = to_svm(vcpu);
>
> - if (vcpu->arch.nested_run_pending)
> - return -EBUSY;
> -
> if (svm_interrupt_blocked(vcpu))
> return 0;
>
> + if (vcpu->arch.nested_run_pending)
> + return -EBUSY;
> +
> /*
> * An IRQ must not be injected into L2 if it's supposed to VM-Exit,
> * e.g. if the IRQ arrived asynchronously after checking nested events.
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 2c120ff47790..9b277e36a521 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -5254,6 +5254,9 @@ bool vmx_interrupt_blocked(struct kvm_vcpu *vcpu)
>
> int vmx_interrupt_allowed(struct kvm_vcpu *vcpu, bool for_injection)
> {
> + if (vmx_interrupt_blocked(vcpu))
> + return 0;
> +
> if (vcpu->arch.nested_run_pending)
> return -EBUSY;
>
> @@ -5264,7 +5267,7 @@ int vmx_interrupt_allowed(struct kvm_vcpu *vcpu, bool for_injection)
> if (for_injection && is_guest_mode(vcpu) && nested_exit_on_intr(vcpu))
> return -EBUSY;
>
> - return !vmx_interrupt_blocked(vcpu);
> + return 1;
> }
>
> int vmx_set_tss_addr(struct kvm *kvm, unsigned int addr)
> --
> 2.54.0.1136.gdb2ca164c4-goog
>