Re: [PATCH] KVM: SVM: Clear dummy V_IRQ in vmcb01 when deactivating AVIC
From: xinguo
Date: Wed Jun 10 2026 - 19:45:44 EST
Fair point: the reasoning in my changelog is incomplete, and I owe you
data rather than speculation.
The reproducer is a workload that repeatedly toggles AVIC on and off,
i.e. avic_activate_vmcb() / avic_deactivate_vmcb() get called many
times in quick succession. Under that load the Windows guest blue
screens with STATUS_INTEGER_DIVIDE_BY_ZERO.
From the dump, Windows takes the bugcheck while dispatching an
interrupt: an unhandled #DE is raised inside the interrupt dispatch
path and ultimately reported by nt!KiInterruptHandler. The faulting
RIP saved in the trap frame is:
    je nt!KiInterruptSubDispatchNoLockNoEtw+0xd5
which is a conditional branch, not a div/idiv. In other words, the
guest is being vectored through IDT entry 0 (#DE) at an instruction
boundary that has nothing to do with division. That is consistent
with hardware injecting vector 0 on KVM's behalf rather than the
guest actually executing a faulting div, and it is what made me
suspect a stale dummy VINTR (int_vector=0, V_IRQ=1) becoming
effective once AVIC is disabled.
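To make the hypothesis concrete, here is a minimal user-space sketch of
the int_ctl state involved. It is not kernel code: the struct and the
model_* helpers are hypothetical, and the bit positions are what I
believe arch/x86/include/asm/svm.h defines, so please check them against
the tree before relying on them.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit layout of int_ctl; verify against asm/svm.h. */
#define V_IRQ_MASK        (1u << 8)
#define V_INTR_PRIO_MASK  (0x0fu << 16)
#define V_IGN_TPR_MASK    (1u << 20)
#define V_IRQ_INJECTION_BITS_MASK \
	(V_IRQ_MASK | V_INTR_PRIO_MASK | V_IGN_TPR_MASK)
#define X2APIC_MODE_MASK  (1u << 30)
#define AVIC_ENABLE_MASK  (1u << 31)

/* Hypothetical stand-in for the two vmcb control fields of interest. */
struct model_vmcb_control {
	uint32_t int_ctl;
	uint32_t int_vector;
};

/* Models only the dummy VINTR from the changelog: vector 0, V_IRQ=1. */
static void model_set_dummy_vintr(struct model_vmcb_control *c)
{
	c->int_vector = 0;
	c->int_ctl |= V_IRQ_MASK;
}

/* Models avic_deactivate_vmcb() before the patch (per the quoted diff). */
static void model_avic_deactivate(struct model_vmcb_control *c)
{
	c->int_ctl &= ~(AVIC_ENABLE_MASK | X2APIC_MODE_MASK);
}

/* Models avic_deactivate_vmcb() with the proposed extra mask. */
static void model_avic_deactivate_patched(struct model_vmcb_control *c)
{
	c->int_ctl &= ~(AVIC_ENABLE_MASK | X2APIC_MODE_MASK |
			V_IRQ_INJECTION_BITS_MASK);
}

/* Assumption: V_IRQ is only honored by hardware while AVIC is off. */
static bool model_virq_effective(const struct model_vmcb_control *c)
{
	return !(c->int_ctl & AVIC_ENABLE_MASK) && (c->int_ctl & V_IRQ_MASK);
}
```

In this model the dummy VINTR is inert while AVIC_ENABLE_MASK is set,
becomes effective after model_avic_deactivate(), and is gone after the
patched variant. It only restates the hypothesis; it says nothing about
how KVM would actually reach that state.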
I agree this needs to be backed by traces, not just by that
hypothesis. Let me instrument svm_set_vintr(), svm_clear_vintr(),
the intercept-recalc paths, and avic_deactivate_vmcb() to capture
vmcb01's int_ctl / int_vector / INTERCEPT_VINTR / is_guest_mode()
at each transition, reproduce the crash, and come back with the
actual call sequence that leaves vmcb01 in a state where V_IRQ
becomes effective once AVIC is disabled.
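As a rough idea of what each trace point would record, here is a
user-space sketch of the record and the check I have in mind. The
struct, the helper names, and the output format are all hypothetical,
and the two bit positions are assumed from asm/svm.h. The "suspicious"
condition is my reading of your point: V_IRQ set in vmcb01 with AVIC
off and INTERCEPT_VINTR clear is the state that should be unreachable.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed bit layout of int_ctl; verify against asm/svm.h. */
#define V_IRQ_MASK       (1u << 8)
#define AVIC_ENABLE_MASK (1u << 31)

/* Hypothetical record of the fields to capture at each transition. */
struct vintr_snapshot {
	const char *site;      /* name of the instrumented function */
	uint32_t int_ctl;      /* vmcb01 int_ctl */
	uint32_t int_vector;   /* vmcb01 int_vector */
	bool intercept_vintr;  /* INTERCEPT_VINTR set in vmcb01? */
	bool guest_mode;       /* is_guest_mode() at that point */
};

/* V_IRQ pending, AVIC off, and no VINTR intercept armed. */
static bool snapshot_is_suspicious(const struct vintr_snapshot *s)
{
	return (s->int_ctl & V_IRQ_MASK) &&
	       !(s->int_ctl & AVIC_ENABLE_MASK) &&
	       !s->intercept_vintr;
}

/* Formats one trace line into buf; returns the snprintf() result. */
static int snapshot_format(const struct vintr_snapshot *s,
			   char *buf, size_t len)
{
	return snprintf(buf, len,
			"%s: int_ctl=0x%08x int_vector=0x%02x "
			"vintr_icpt=%d guest_mode=%d%s",
			s->site, (unsigned int)s->int_ctl,
			(unsigned int)s->int_vector,
			s->intercept_vintr, s->guest_mode,
			snapshot_is_suspicious(s) ? " SUSPICIOUS" : "");
}
```

In the kernel the same fields would presumably go through trace_printk()
or a tracepoint rather than snprintf(); the sketch is only meant to pin
down which fields are logged and which combination gets flagged.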
Please hold off on this patch in the meantime; I'll resend (or drop
it) based on what the trace shows.
Thanks for the review.
> On Jun 10, 2026, at 20:45, Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> On Wed, Jun 10, 2026, xin guo wrote:
>> When KVM requests an IRQ window via svm_set_vintr(), it programs a
>> dummy VINTR with int_vector=0 and V_IRQ=1 into the current VMCB.
>> These int_ctl fields are documented to be ignored while AVIC is
>> enabled, so the dummy VINTR is harmless during AVIC operation.
>>
>> However, avic_deactivate_vmcb() only clears AVIC_ENABLE_MASK and
>> X2APIC_MODE_MASK, and does not clear the VINTR injection state. Once
>> AVIC is disabled, hardware honors V_IRQ again and injects vector 0
>> into the guest on the next VMRUN. Windows guests observe this as a
>> spurious interrupt and crash, e.g. with STATUS_INTEGER_DIVIDE_BY_ZERO.
>
> Can you provide a reproducer, or at least instructions to reproduce? This feels
> like we're treating a symptom, not the underlying bug. And while I can definitely
> see KVM leaving a stale V_IRQ_MASK in vmcb01, I don't see how that can happen
> while also clearing INTERCEPT_VINTR, as the only place INTERCEPT_VINTR is cleared
> in vmcb01 is svm_clear_vintr(), which also purges V_IRQ_MASK.
>
> 	svm_clr_intercept(svm, INTERCEPT_VINTR);
>
> 	/* Drop int_ctl fields related to VINTR injection. */
> 	svm->vmcb->control.int_ctl &= ~V_IRQ_INJECTION_BITS_MASK;
>
>> Fix this by also clearing V_IRQ_INJECTION_BITS_MASK from vmcb01's
>> int_ctl in avic_deactivate_vmcb(), so that no stale dummy VINTR is
>> left behind when AVIC transitions from enabled to disabled.
>>
>> Signed-off-by: xin guo <m18700951735@xxxxxxx>
>> ---
>> arch/x86/kvm/svm/avic.c | 4 +++-
>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
>> index cdd5a6dc646f..b042c3f5f90e 100644
>> --- a/arch/x86/kvm/svm/avic.c
>> +++ b/arch/x86/kvm/svm/avic.c
>> @@ -257,7 +257,9 @@ static void avic_deactivate_vmcb(struct vcpu_svm *svm)
>>  {
>>  	struct vmcb *vmcb = svm->vmcb01.ptr;
>>
>> -	vmcb->control.int_ctl &= ~(AVIC_ENABLE_MASK | X2APIC_MODE_MASK);
>> +	vmcb->control.int_ctl &= ~(AVIC_ENABLE_MASK | X2APIC_MODE_MASK |
>> +				   V_IRQ_INJECTION_BITS_MASK);
>> +
>>  	vmcb->control.avic_physical_id &= ~AVIC_PHYSICAL_MAX_INDEX_MASK;
>>
>>  	if (!is_sev_es_guest(&svm->vcpu))
>> --
>> 2.27.0
>>