Re: MSIs not freed in GICv3 ITS driver
From: Marc Zyngier
Date: Thu Feb 26 2026 - 08:45:49 EST
On Wed, 25 Feb 2026 09:34:41 +0000,
Qiang Yu <qiang.yu@xxxxxxxxxxxxxxxx> wrote:
>
> On Thu, Feb 19, 2026 at 04:54:29PM +0000, Marc Zyngier wrote:
> > On Fri, 16 Jan 2026 15:03:33 +0000,
> > Manivannan Sadhasivam <mani@xxxxxxxxxx> wrote:
> > >
> > > Hi Marc,
> > >
> > > Looks like this has fallen through the cracks and my colleague internally
> > > reported a warning during the removal of a PCI driver and it seems to be related
> > > to the issue we were discussing in this thread:
> > >
> > > [ 54.727284] WARNING: drivers/irqchip/irq-gic-v3-its.c:3639 at its_msi_teardown+0x11c/0x13c, CPU#4: kworker/u73:1/115
> > > [ 54.738366] Modules linked in: mhi_pci_generic mhi nvme_core usb_f_fs libcomposite sm3_ce nvmem_qcom_spmi_sdam qcom_pon rtc_pm8xxx qcom_spmi_temp_alarm qcom_stats dispcc_glymur gpi llcc_qcom phy_qcom_qmp_pcie qcom_cpucp_mbox qcom_wdt socinfo
> > > [ 54.760588] CPU: 4 UID: 0 PID: 115 Comm: kworker/u73:1 Tainted: G W 6.18.0-next-20251210-14099-gc20082c23661-dirty #2 PREEMPT
> > > [ 54.774067] Tainted: [W]=WARN
> > > [ 54.777412] Hardware name: Qualcomm MTP/Qualcomm Test Device, BIOS 7.0.251121.BOOT.OSSUEFI.3.1-00008-GLYMUR-1 11/21/2025
> > > [ 54.788849] Workqueue: async async_run_entry_fn
> > > [ 54.793791] pstate: 21400009 (nzCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
> > > [ 54.801230] pc : its_msi_teardown+0x11c/0x13c
> > > [ 54.805997] lr : its_msi_teardown+0x54/0x13c
> > > [ 54.810675] sp : ffff8000837cb710
> > > [ 54.814373] x29: ffff8000837cb710 x28: ffff00080190e410 x27: ffff0008085ba390
> > > [ 54.821985] x26: ffff000808629bf0 x25: 0000000000000000 x24: 0000000000000066
> > > [ 54.829602] x23: 0000000000000007 x22: 0000000000000020 x21: ffff000800059608
> > > [ 54.837209] x20: ffff000800059607 x19: ffff000800a4a300 x18: 00000000ffffffff
> > > [ 54.844819] x17: ffff00080ec65400 x16: ffff00080ec65200 x15: ffff00080ec65000
> > > [ 54.852429] x14: 0000000000000004 x13: ffff0008000b8810 x12: 0000000000000000
> > > [ 54.860046] x11: ffff0008007798e8 x10: 0000000000000002 x9 : 0000000000000001
> > > [ 54.867661] x8 : ffff0008007796f8 x7 : 000000000000001f x6 : ffff8000837cb640
> > > [ 54.875277] x5 : ffff000801918f40 x4 : 0000000000000007 x3 : 0000000000000000
> > > [ 54.882891] x2 : ffff000800a037c0 x1 : 0000000000000020 x0 : 0000000000000007
> > > [ 54.890509] Call trace:
> > > [ 54.893320] its_msi_teardown+0x11c/0x13c (P)
> > > [ 54.898082] its_msi_teardown+0x34/0x44
> > > [ 54.902316] msi_remove_device_irq_domain+0x70/0x114
> > > [ 54.907701] msi_device_data_release+0x20/0x64
> > > [ 54.912551] devres_release_all+0xa4/0x104
> >
> > That's nowhere near enough information for me to do anything about it.
> >
> > Unless you describe exactly what device this is, its allocation
> > requirements, the topology of the system and finally reproduce it on a
> > vanilla kernel and not something that I have no access to, I can't do
> > much for you.
>
> Hi Marc,
>
> Thanks for the feedback. I can reproduce this issue with latest linux-next
> tag next-20260224.

Please don't test on -next. Pick the latest tag from Linus. As far as
I am concerned, -next bears no relevance whatsoever.

>
> The host is Glymur (Qualcomm compute platform) with an SDX75 modem
> connected via PCIe. The SDX75 driver requests 7 MSI IRQs, and the warning
> triggers during driver removal.
>
> I think this is actually a common problem with how we handle
> MSI allocation vs freeing. Here's what I'm seeing:
>
> When allocating, irq_domain_alloc_irqs_hierarchy() makes one call to
> domain->ops->alloc() with nr_irqs=7. The MSI controller (ITS in this case
> but DWC-MSI has similar behavior) finds a free power-of-2-sized region in
> its bitmap, so it allocates 8 contiguous bits to satisfy the 7 IRQ request.

Well, it's not like the ITS has a choice. Given that the ITT size is
expressed in a number of bits, you get the choice between a power of
two or absolutely nothing.
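
To put numbers on that, here is a minimal sketch of the rounding,
assuming only that the event space is sized as a number of bits and
ignoring any minimum size the hardware or driver may impose. The
function names are made up for illustration and are not kernel APIs.

```python
def event_id_bits(nvecs: int) -> int:
    """Smallest number of bits b such that 2**b >= nvecs (hypothetical helper)."""
    if nvecs < 1:
        raise ValueError("need at least one vector")
    return (nvecs - 1).bit_length()


def events_reserved(nvecs: int) -> int:
    """Number of events actually covered once the size is rounded up."""
    return 1 << event_id_bits(nvecs)


# Print the requested count, the bit width, and the resulting event count.
for n in (1, 7, 8, 9):
    print(n, event_id_bits(n), events_reserved(n))
```

A request for 7 vectors needs 3 bits and therefore covers 8 events;
there is no 7-entry option to choose from.
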
I'm not going to comment on the DWC stuff, as it has been bitrotting
for the best part of two decades.

>
> But when freeing, irq_domain_free_irqs_hierarchy() loops and calls
> domain->ops->free() seven times, each with nr_irqs=1. So we end up freeing
> 7 individual bits instead of the original 8 bits that were allocated.
>
> This allocation/free mismatch seems to corrupt the bitmap tracking, which
> is what triggers the warning in its_msi_teardown().
>
> I suspect this would happen with any PCIe device that requests a
> non-power-of-2 number of MSI IRQs on systems using ITS or DWC-MSI.
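
For reference, the sequence described above can be reproduced with a
toy bitmap. This only models the claim as stated (one region
allocation rounded up to a power of two, then one free per interrupt).
It is not the code in irq-gic-v3-its.c, and ToyEventMap and the two
helper functions are hypothetical names.

```python
class ToyEventMap:
    """Toy bitmap allocator; hypothetical, not the ITS driver's data structure."""

    def __init__(self, nbits: int) -> None:
        self.bits = [False] * nbits

    @staticmethod
    def _size(nr_irqs: int) -> int:
        # Round the request up to the next power of two.
        return 1 << (nr_irqs - 1).bit_length()

    def alloc(self, nr_irqs: int) -> int:
        """Reserve a naturally aligned power-of-two region, return its base."""
        size = self._size(nr_irqs)
        for base in range(0, len(self.bits) - size + 1, size):
            if not any(self.bits[base:base + size]):
                for i in range(base, base + size):
                    self.bits[i] = True
                return base
        raise MemoryError("no free region")

    def free(self, base: int, nr_irqs: int) -> None:
        """Release a region whose size is nr_irqs rounded up to a power of two."""
        for i in range(base, base + self._size(nr_irqs)):
            self.bits[i] = False

    def leftover(self) -> list:
        """Indices of bits that are still marked as in use."""
        return [i for i, used in enumerate(self.bits) if used]


def per_irq_free(nr_irqs: int = 7) -> list:
    """One alloc of nr_irqs, then nr_irqs frees of a single bit each."""
    m = ToyEventMap(32)
    base = m.alloc(nr_irqs)
    for i in range(nr_irqs):
        m.free(base + i, 1)
    return m.leftover()


def single_free(nr_irqs: int = 7) -> list:
    """One alloc of nr_irqs, then one free of the same count."""
    m = ToyEventMap(32)
    base = m.alloc(nr_irqs)
    m.free(base, nr_irqs)
    return m.leftover()


print(per_irq_free(7))
print(single_free(7))
```

In this model the per-interrupt free leaves bit 7 set, while the
single free leaves nothing behind. Whether the real driver behaves
this way depends on how the device allocates its vectors.
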
Is this device doing Multi-MSI or MSI-X? Please post an 'lspci -vv' so
that we know what we are up against.

Thanks,
M.
--
Without deviation from the norm, progress is not possible.