Re: [PATCH v3] KVM: x86: Add x2APIC "features" to control EOI broadcast suppression

From: David Woodhouse

Date: Wed Dec 03 2025 - 08:32:14 EST


On Wed, 2025-12-03 at 14:10 +0100, Paolo Bonzini wrote:
> On Wed, Dec 3, 2025 at 1:26 PM David Woodhouse <dwmw2@xxxxxxxxxxxxx> wrote:
> >
> > On Wed, 2025-12-03 at 00:50 +0000, Huang, Kai wrote:
> > >
> > > > -#define KVM_X2APIC_API_USE_32BIT_IDS            (1ULL << 0)
> > > > -#define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK  (1ULL << 1)
> > > > +#define KVM_X2APIC_API_USE_32BIT_IDS                       (_BITULL(0))
> > > > +#define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK             (_BITULL(1))
> > > > +#define KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST   (_BITULL(2))
> > > > +#define KVM_X2APIC_DISABLE_SUPPRESS_EOI_BROADCAST  (_BITULL(3))
> > >
> > > I hate to say, but wants to ask again:
> > >
> > > Since it's uAPI, are we expecting the two flags to have impact on in-kernel
> > > ioapic?
> > >
> > > I think there should no harm to make the two also apply to in-kernel ioapic.
> > >
> > > E.g., for now we can reject KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST flag for
> > > in-kernel ioapic.  In the future, we might add EOI register support to in-kernel
> > > ioapic and report supporting suppress EOI broadcast, then we can in-kernel
> > > ioapic to honor these two flags too.
> >
> > I don't think we should leave that to the unspecified 'future'. Let's
> > fix the kernel I/O APIC to support the directed EOI at the same time,
> > rather than having an interim version of KVM which supports the
> > broadcast suppression but *not* the explicit EOI that replaces it.
> >
> > Since I happened to have the I/O APIC PDFs in my recent history for
> > other reasons, and implemented these extra registers for version 0x20
> > in another userspace VMM within living memory, I figured I could try to
> > help with the actual implementation (untested, below).
> >
> > There is some bikeshedding to be done on precisely *how* ->version_id
> > should be set. Maybe we shouldn't have the ->version_id field, and
> > should just check kvm->arch.suppress_eoi_broadcast to see which version
> > to report?
>
> That would make it impossible to use the fixed implementation on the
> local APIC side, without changing the way the IOAPIC appears to the
> guest.

Yes, but remember that "the fixed implementation on the local APIC
side" means precisely that it's fixed to *not* broadcast the EOI. Which
means you absolutely *need* to have an I/O APIC capable of receiving
the explicit directed EOI, or the EOI will never happen at all.

Which is why it probably makes sense to drop the 'version_id' field
from the struct where I'd added it, and just make the code report a
hard-coded version based on suppress_eoi_broadcast being enabled:

(kvm->arch.suppress_eoi_broadcast == KVM_SUPPRESS_EOI_ENABLED) ? 0x20: 0x11

So yes, it's a guest-visible change, but only if the VMM explicitly
*asks* for the broadcast suppression feature to work, in which case
it's *necessary* anyway.


> There are no parameters that you can use in KVM_CREATE_IRQCHIP,
> unfortunately, and no checks that (for example) kvm_irqchip.pad or
> kvm_ioapic_state.pad are zero.
>
> The best possibility that I can think of, is to model it like KVM_CAP_XSAVE2
>
> 1) Add a capability KVM_CAP_IRQCHIP2 (x86 only)
>
> 2) If reported, kvm_irqchip.pad becomes "flags" (a set of flag bits)
> and kvm_ioapic_state.pad becomes version_id when returned from
> KVM_GET_IRQCHIP. Using an anonymous union allows adding the synonyms.
>
> 3) On top of this, KVM_SET_IRQCHIP2 is added which checks that
> kvm_irqchip.flags is zero and that kvm_ioapic_state.version_id is
> either 0x11 or 0x20.
>
> 4) Leave the default to 0x11 for backwards compatibility.
>
> The alternative is to add KVM_ENABLE_CAP(KVM_CAP_IRQCHIP2) but I
> dislike adding another stateful API.

Yeah. Just gate it on the existing (well, nascent)
KVM_X2APIC_ENABLE_SUPPRESS_EOI_BROADCAST flag.

Attachment: smime.p7s
Description: S/MIME cryptographic signature