Re: MSIs not freed in GICv3 ITS driver

From: Manivannan Sadhasivam

Date: Tue Mar 03 2026 - 04:31:10 EST


On Thu, Feb 26, 2026 at 01:39:35PM +0000, Marc Zyngier wrote:
> On Wed, 25 Feb 2026 09:34:41 +0000,
> Qiang Yu <qiang.yu@xxxxxxxxxxxxxxxx> wrote:
> >
> > On Thu, Feb 19, 2026 at 04:54:29PM +0000, Marc Zyngier wrote:
> > > On Fri, 16 Jan 2026 15:03:33 +0000,
> > > Manivannan Sadhasivam <mani@xxxxxxxxxx> wrote:
> > > >
> > > > Hi Marc,
> > > >
> > > > Looks like this has fallen through the cracks and my colleague internally
> > > > reported a warning during the removal of a PCI driver and it seems to be related
> > > > to the issue we were discussing in this thread:
> > > >
> > > > [ 54.727284] WARNING: drivers/irqchip/irq-gic-v3-its.c:3639 at its_msi_teardown+0x11c/0x13c, CPU#4: kworker/u73:1/115
> > > > [ 54.738366] Modules linked in: mhi_pci_generic mhi nvme_core usb_f_fs libcomposite sm3_ce nvmem_qcom_spmi_sdam qcom_pon rtc_pm8xxx qcom_spmi_temp_alarm qcom_stats dispcc_glymur gpi llcc_qcom phy_qcom_qmp_pcie qcom_cpucp_mbox qcom_wdt socinfo
> > > > [ 54.760588] CPU: 4 UID: 0 PID: 115 Comm: kworker/u73:1 Tainted: G W 6.18.0-next-20251210-14099-gc20082c23661-dirty #2 PREEMPT
> > > > [ 54.774067] Tainted: [W]=WARN
> > > > [ 54.777412] Hardware name: Qualcomm MTP/Qualcomm Test Device, BIOS 7.0.251121.BOOT.OSSUEFI.3.1-00008-GLYMUR-1 11/21/2025
> > > > [ 54.788849] Workqueue: async async_run_entry_fn
> > > > [ 54.793791] pstate: 21400009 (nzCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
> > > > [ 54.801230] pc : its_msi_teardown+0x11c/0x13c
> > > > [ 54.805997] lr : its_msi_teardown+0x54/0x13c
> > > > [ 54.810675] sp : ffff8000837cb710
> > > > [ 54.814373] x29: ffff8000837cb710 x28: ffff00080190e410 x27: ffff0008085ba390
> > > > [ 54.821985] x26: ffff000808629bf0 x25: 0000000000000000 x24: 0000000000000066
> > > > [ 54.829602] x23: 0000000000000007 x22: 0000000000000020 x21: ffff000800059608
> > > > [ 54.837209] x20: ffff000800059607 x19: ffff000800a4a300 x18: 00000000ffffffff
> > > > [ 54.844819] x17: ffff00080ec65400 x16: ffff00080ec65200 x15: ffff00080ec65000
> > > > [ 54.852429] x14: 0000000000000004 x13: ffff0008000b8810 x12: 0000000000000000
> > > > [ 54.860046] x11: ffff0008007798e8 x10: 0000000000000002 x9 : 0000000000000001
> > > > [ 54.867661] x8 : ffff0008007796f8 x7 : 000000000000001f x6 : ffff8000837cb640
> > > > [ 54.875277] x5 : ffff000801918f40 x4 : 0000000000000007 x3 : 0000000000000000
> > > > [ 54.882891] x2 : ffff000800a037c0 x1 : 0000000000000020 x0 : 0000000000000007
> > > > [ 54.890509] Call trace:
> > > > [ 54.893320] its_msi_teardown+0x11c/0x13c (P)
> > > > [ 54.898082] its_msi_teardown+0x34/0x44
> > > > [ 54.902316] msi_remove_device_irq_domain+0x70/0x114
> > > > [ 54.907701] msi_device_data_release+0x20/0x64
> > > > [ 54.912551] devres_release_all+0xa4/0x104
> > >
> > > That's nowhere near enough information for me to do anything about it.
> > >
> > > Unless you describe exactly what device this is, its allocation
> > > requirements, the topology of the system and finally reproduce it on a
> > > vanilla kernel and not something that I have no access to, I can't do
> > > much for you.
> >
> > Hi Marc,
> >
> > Thanks for the feedback. I can reproduce this issue with latest linux-next
> > tag next-20260224.
>
> Please don't test on -next. Pick the latest tag from Linus. As far as
> I am concerned, -next bears no relevance whatsoever.
>
> >
> > The host is Glymur (Qualcomm compute platform) with an SDX75 modem
> > connected via PCIe. The SDX75 driver requests 7 MSI IRQs, and the warning
> > triggers during driver removal.
> >
> > I think this is actually a common problem with how we handle
> > MSI allocation vs freeing. Here's what I'm seeing:
> >
> > When allocating, irq_domain_alloc_irqs_hierarchy() makes one call to
> > domain->ops->alloc() with nr_irqs=7. The MSI controller (ITS in this case
> > but DWC-MSI has similar behavior) finds a power-of-2 number of bits in its bitmap
> > region, so it allocates 8 contiguous bits to satisfy the 7 IRQ request.
>
> Well, it's not like the ITS has a choice. Given that the ITT size is
> expressed in a number of bits, you get the choice between a power of
> two or absolutely nothing.
>

But the underlying issue is that the ITS driver (and maybe other MSI controller
drivers) is not freeing *all* of its requested IRQs.
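To make the suspected mismatch concrete, here is a toy userspace model of it.
This is only a sketch and not the ITS code: struct toy_map and the toy_*()
helpers are hypothetical names, and a single 32-bit word stands in for the
driver's bitmap. It assumes the behaviour described above: the allocation
reserves a power-of-2 region, while the free path only releases the number of
entries that were requested.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy allocator: one 32-bit word stands in for the bitmap. */
struct toy_map {
	uint32_t bits;
};

/* Round n (>= 1) up to the next power of two. */
static unsigned int toy_roundup_pow2(unsigned int n)
{
	unsigned int p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/*
 * Reserve a naturally aligned power-of-two region large enough for nr
 * entries. Returns the first bit of the region, or -1 if nothing fits.
 */
static int toy_alloc(struct toy_map *m, unsigned int nr)
{
	unsigned int span, base;
	uint32_t mask;

	assert(nr >= 1 && nr <= 32);
	span = toy_roundup_pow2(nr);
	mask = (span == 32) ? UINT32_MAX : ((1u << span) - 1);

	for (base = 0; base + span <= 32; base += span) {
		if (!(m->bits & (mask << base))) {
			m->bits |= mask << base;
			return (int)base;
		}
	}
	return -1;
}

/* Release exactly nr entries starting at base: only what was requested. */
static void toy_free(struct toy_map *m, unsigned int base, unsigned int nr)
{
	unsigned int i;

	for (i = 0; i < nr; i++)
		m->bits &= ~(1u << (base + i));
}

/* Teardown-style check: how many entries are still marked as in use? */
static unsigned int toy_in_use(const struct toy_map *m)
{
	unsigned int n = 0;
	uint32_t b;

	for (b = m->bits; b; b &= b - 1)
		n++;
	return n;
}
```

In this model, toy_alloc() for 7 entries marks 8 bits, and a following
toy_free() of 7 leaves one bit set, so a "map must be empty" check at teardown
would fire.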

I tried reproducing this issue with QEMU:

1. Modified the EDU device in QEMU to support 8 MSIs:

```
diff --git a/hw/misc/edu.c b/hw/misc/edu.c
index cece633e11..95b658ef33 100644
--- a/hw/misc/edu.c
+++ b/hw/misc/edu.c
@@ -373,7 +373,7 @@ static void pci_edu_realize(PCIDevice *pdev, Error **errp)

     pci_config_set_interrupt_pin(pci_conf, 1);

-    if (msi_init(pdev, 0, 1, true, false, errp)) {
+    if (msi_init(pdev, 0, 8, true, false, errp)) {
         return;
     }
```

2. Then I wrote a simple driver to request 3 IRQs using pci_alloc_irq_vectors()
and loaded it:

00:04.0 Unclassified device [00ff]: Device 1234:11e8 (rev 10)
	Subsystem: Red Hat, Inc. Device 1100
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 25
	Region 0: Memory at 10200000 (32-bit, non-prefetchable) [size=1M]
	Capabilities: [40] MSI: Enable+ Count=4/8 Maskable- 64bit+
		Address: 0000000008090040 Data: 0000
	Kernel driver in use: edu
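
Note the "Count=4/8" above even though the driver asked for 3 vectors. The
MSI capability encodes the enabled vector count as a log2 value, so a request
can only be enabled as a power of 2. A small userspace sketch of that
arithmetic (msi_order() and msi_enabled_count() are hypothetical helper names
for illustration, not kernel functions):

```c
#include <assert.h>

/* Smallest order such that (1 << order) >= nvec; nvec must be 1..32. */
static unsigned int msi_order(unsigned int nvec)
{
	unsigned int order = 0;

	assert(nvec >= 1 && nvec <= 32);
	while ((1u << order) < nvec)
		order++;
	return order;
}

/* Number of vectors that end up enabled for a request of nvec. */
static unsigned int msi_enabled_count(unsigned int nvec)
{
	return 1u << msi_order(nvec);
}
```

With this, a request for 3 vectors gives 4 (matching the lspci output), and
the 7 vectors requested by the SDX75 driver give 8.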

3. Removing the driver with rmmod triggers the warning below (the same one
Qiang reported on the Qcom platform):

[ 138.082682] ------------[ cut here ]------------
[ 138.082797] WARNING: drivers/irqchip/irq-gic-v3-its.c:3639 at its_msi_teardown+0x150/0x190, CPU#0: rmmod/739
[ 138.083617] Modules linked in: edu(OE-) virtio_net aes_ce_blk aes_ce_cipher ghash_ce sm4 gpio_keys xt_conntrack nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack_netlink nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 virtio_pci xfrm_user xfrm_algo virtio_pci_legacy_dev rtc_pl031 virtio_pci_modern_dev xt_addrtype nft_compat x_tables nf_tables br_netfilter bridge stp llc vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vmw_vmci vsock overlay qrtr binfmt_misc efi_pstore sch_fq_codel libcomposite nfnetlink qemu_fw_cfg autofs4
[ 138.085222] CPU: 0 UID: 0 PID: 739 Comm: rmmod Tainted: G OE 6.19.0-rc1+ #26 PREEMPT(voluntary)
[ 138.085414] Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
[ 138.085522] Hardware name: linux,dummy-virt (DT)
[ 138.085712] pstate: 20400005 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 138.085872] pc : its_msi_teardown+0x150/0x190
[ 138.085974] lr : its_msi_teardown+0x6c/0x190
[ 138.086073] sp : ffff800080903a90
[ 138.086156] x29: ffff800080903ac0 x28: ffff000016e8c200 x27: 0000000000000000
[ 138.086352] x26: 0000000000000000 x25: 0000000000000000 x24: ffffb1373e977ba0
[ 138.086509] x23: ffffb1373eebeeb0 x22: 0000000000000008 x21: ffff000002e21a07
[ 138.086671] x20: ffff000002e21a08 x19: ffff000004895e00 x18: ffff800080573098
[ 138.086828] x17: 0000000000000000 x16: 0000000000000000 x15: ffff000004f5b400
[ 138.086984] x14: ffff000004f0a200 x13: 0000000000000000 x12: 0000000000000000
[ 138.087142] x11: 0000000000000000 x10: 0000000000000000 x9 : ffffb1373e430234
[ 138.087297] x8 : 0000000000000000 x7 : 0000000000000000 x6 : 0000000000000000
[ 138.087452] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000000
[ 138.087595] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000003
[ 138.087791] Call trace:
[ 138.087954] its_msi_teardown+0x150/0x190 (P)
[ 138.088105] its_msi_teardown+0x40/0x70
[ 138.088196] msi_remove_device_irq_domain+0x84/0x128
[ 138.088289] msi_device_data_release+0x2c/0xa0
[ 138.088370] release_nodes+0x70/0x138
[ 138.088443] devres_release_all+0xa0/0x120
[ 138.088522] device_unbind_cleanup+0x24/0x98
[ 138.088614] device_release_driver_internal+0x238/0x2f8
[ 138.088710] driver_detach+0x58/0xc0
[ 138.088782] bus_remove_driver+0x80/0x140
[ 138.088859] driver_unregister+0x3c/0xa0
[ 138.088935] pci_unregister_driver+0x30/0xc0
[ 138.089021] edu_exit+0x28/0xa8 [edu]
[ 138.089280] __arm64_sys_delete_module+0x1e4/0x398
[ 138.089388] invoke_syscall.constprop.0+0x68/0x108
[ 138.089497] el0_svc_common.constprop.0+0x44/0x140
[ 138.089591] do_el0_svc+0x28/0x58
[ 138.089661] el0_svc+0x44/0x230
[ 138.089731] el0t_64_sync_handler+0xc0/0x110
[ 138.089815] el0t_64_sync+0x1b8/0x1c0
[ 138.089975] ---[ end trace 0000000000000000 ]---

> I'm not going to comment on the DWC stuff, as it has been bitrotting
> for the best part of two decades.
>

The above issue should be applicable to other MSI controller drivers as well,
not just DWC.

- Mani

--
மணிவண்ணன் சதாசிவம்