Re: [PATCH v2] irqchip/sifive-plic: Fix insufficient irq_groups allocation

From: Thomas Gleixner

Date: Tue Jan 27 2026 - 11:07:25 EST


On Wed, Jan 21 2026 at 13:45, Yangyu Chen wrote:
> Since the first irq source is 1 instead of 0, when the number of
> irqs is multiple of 32, the last irq group will be ignored during
> allocation, saving, and restoring. This lead to memory corruption
> when accessing enable_save beyond allocated memory after commit
> 14ff9e54dd14 ("irqchip/sifive-plic: Cache the interrupt enable state")
> which will access enable_save for all sources during plic_probe.
> Thus, we should allocate irq_groups based on (nr_irqs + 1) instead of
> nr_irqs to avoid this issue.
>
> This is an long standing bug since Linux v6.4 but since the last irq
> source is rarely used, it may not be triggered in practice until commit
> 14ff9e54dd14 ("irqchip/sifive-plic: Cache the interrupt enable state").

I'm absolutely not convinced that this is the right fix. The handling of
nr_irqs in this driver is completely inconsistent:

1)

static int plic_irq_suspend(void *data)
/* irq ID 0 is reserved */
for (unsigned int i = 1; i < priv->nr_irqs; i++) {

2)
static void plic_irq_resume(void *data)
/* irq ID 0 is reserved */
for (unsigned int i = 1; i < priv->nr_irqs; i++) {
...
for (i = 0; i < DIV_ROUND_UP(priv->nr_irqs, 32); i++) {

3)

static irq_hw_number_t cp100_get_hwirq(struct plic_handler *handler, void __iomem *claim)
int nr_irq_groups = DIV_ROUND_UP(handler->priv->nr_irqs, 32);

for (i = 0; i < nr_irq_groups; i++) {
4)

static int plic_probe(struct fwnode_handle *fwnode)

priv->prio_save = bitmap_zalloc(nr_irqs, GFP_KERNEL);
...
for (int j = 0; j <= nr_irqs / 32; j++)
writel(0, enable_base + j);
...
handler->enable_save = kcalloc(DIV_ROUND_UP(nr_irqs, 32),...
...
for (hwirq = 1; hwirq <= nr_irqs; hwirq++) {
...
priv->irqdomain = irq_domain_create_linear(fwnode, nr_irqs + 1,

So can the SIFIVE people please clarify once and forever what's the
actual meaning of nr_irqs:

A) The actual number of hardware interrupts available for drivers
starting from hwirq=1

B) The total number of hardware interrupts, where hwirq=0 is reserved
and therefore the number of interrupts available to drivers is
nr_irqs - 1.

That needs to be clarified first and then subsequently every single
instance which deals with nr_irqs has to be updated accordingly.

A random fix which cures the symptom of an out of bounds memory access
is not helpful at all to cure the underlying root cause of
inconsistency.

Thanks,

tglx