Re: [PATCH v1] irqchip: fix mask alignment in gic-v2m

From: Thomas Gleixner

Date: Tue Mar 24 2026 - 11:28:30 EST

On Mon, Mar 23 2026 at 13:37, Marc Zyngier wrote:
> On Sat, 21 Mar 2026 14:12:16 +0000,
> Javier Achirica <jachirica@xxxxxxxxx> wrote:
>>
>> commit 2ef3886ce626dcdab0cbc452dbbebc19f57133d8 ("irqchip/gic-v2m: Handle
>> Multiple MSI base IRQ Alignment") introduced a regression in kernel 6.12.58
>> affecting PCIe devices using GICv2m MSI on a Qualcomm (arm64) platform.
>>
>> It uses nr_irqs parameter to generate a mask to align the MSI base address,
>> but this mask isn't properly generated when nr_irqs isn't a power of two.
>> This bug was found while adding support for the TCL HH500V router in OpenWrt.
>>
>> This patch fixes the issue, can be cleanly applied to the 6.12.x tree and,
>> with a small fuzz, to 7.0.x.
>>
>> Signed-off-by: Javier Achirica <jachirica@xxxxxxxxx>
>> Cc: stable@xxxxxxxxxxxxxxx
>> ---
>> --- a/drivers/irqchip/irq-gic-v2m.c 2026-03-20 09:45:22.170192561 +0100
>> +++ b/drivers/irqchip/irq-gic-v2m.c 2026-03-20 09:45:26.284210783 +0100
>> @@ -158,7 +158,7 @@
>> struct v2m_data *v2m = NULL, *tmp;
>> int hwirq, i, err = 0;
>> unsigned long offset;
>> - unsigned long align_mask = nr_irqs - 1;
>> + unsigned long align_mask = roundup_pow_of_two(nr_irqs) - 1;
>>
>> spin_lock(&v2m_lock);
>> list_for_each_entry(tmp, &v2m_nodes, entry) {
>>
>
> This looks wrong for a bunch of reasons:
>
> - you're hacking the allocation path, but not the free path -- what
> could possibly go wrong?
>
> - nr_irqs not being a power of two to start with is more indicative of
> a bug somewhere else in the system. The only case where we allocate
> more than a single IRQ at a time is for Multi-MSI, and that is
> definitely a power-of-two construct.

Right. Though the PCI/MSI core has never enforced it.

It just ensures that the number of requested interrupts is less than or
equal to the power-of-two number advertised in the Multiple Message
Capable field of the Message Control word.

It only writes back round_up_power_of_two(nvec) to the Multiple Message
Enable field and hands the non-power-of-two allocation request (nvec)
down to the domain.

x86 has handled this silently under the hood forever. The IRTE
allocation rounds nvec up to the next power of two.
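
The rounding, and the reason an unrounded count makes a poor alignment
mask, can be sketched in userspace C. This is only an illustration:
roundup_pow2() is a hypothetical stand-in for the kernel's
roundup_pow_of_two(), and the two mask helpers are made-up names, not
functions from irq-gic-v2m.c or the x86 code.

```c
#include <assert.h>

/* Hypothetical userspace stand-in for roundup_pow_of_two(); n must be > 0. */
static inline unsigned long roundup_pow2(unsigned long n)
{
	unsigned long p = 1;

	assert(n > 0);
	while (p < n)
		p <<= 1;
	return p;
}

/* Mask derived directly from the count: only contiguous for power-of-two n. */
static inline unsigned long naive_align_mask(unsigned long nr_irqs)
{
	return nr_irqs - 1;
}

/* Mask derived from the rounded-up count, as the quoted patch does. */
static inline unsigned long rounded_align_mask(unsigned long nr_irqs)
{
	return roundup_pow2(nr_irqs) - 1;
}
```

For nr_irqs = 5 the naive mask is 0x4, which covers only bit 2, while
the rounded mask is 0x7. For nr_irqs = 8 both give 0x7.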

If the driver requests minvec = 3 and maxvec = 5 and the hardware
supports 8, pci_msi_enable_range() will allocate 5 in the device
domain, resulting in a table size of 8 and 5 actually allocated
interrupts.
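
The arithmetic of that example, as a rough userspace sketch. The struct
and msi_sketch_alloc() are hypothetical names for illustration only, not
the PCI/MSI core code, and minvec handling is left out. It assumes the
Multiple Message Enable field encodes log2 of the table size.

```c
#include <assert.h>

/* Hypothetical result of one Multi-MSI setup, for illustration only. */
struct msi_sketch {
	unsigned int allocated;   /* interrupts allocated in the device domain */
	unsigned int table_size;  /* power-of-two count enabled in the device */
	unsigned int mme;         /* log2(table_size), the MME field encoding */
};

/* maxvec: what the driver asks for; hw_cap: power-of-two count from MMC. */
static inline struct msi_sketch msi_sketch_alloc(unsigned int maxvec,
						 unsigned int hw_cap)
{
	struct msi_sketch r = { 0, 1, 0 };

	assert(maxvec > 0 && hw_cap > 0);

	/* The request is only capped by what the hardware advertises. */
	r.allocated = maxvec < hw_cap ? maxvec : hw_cap;

	/* The device is programmed with the next power of two. */
	while (r.table_size < r.allocated) {
		r.table_size <<= 1;
		r.mme++;
	}
	return r;
}
```

msi_sketch_alloc(5, 8) yields allocated = 5, table_size = 8 and mme = 3,
so three table slots are enabled in the device without a matching
interrupt in the domain.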

The PCI/MSI core could allocate 8 in the device domain and stay
backwards compatible by returning 5 to the caller. The downside would be
that this fully allocates 3 extra unused interrupt descriptors and
resources throughout the domain hierarchy.

It's mostly memory, and the only problematic case would be affinity
managed interrupts, where the over-allocation actually affects the
scarce x86 vector space. I can't tell off the top of my head whether
managed mode is actually supported with MULTI-MSI or not. It might be,
but that needs some investigation.

Can't we have nice things for once?

Thanks,

tglx