Re: [RFC PATCH] genirq: Enforce monotonic increase contract in irq_get_next_irq()
From: Thomas Gleixner
Date: Thu Jun 04 2026 - 05:25:52 EST
On Wed, Jun 03 2026 at 22:01, Aaron Tomlin wrote:
> When an IRQ descriptor is corrupted in memory (e.g., via an out-of-bounds
> write by a rogue driver), the descriptor's internal IRQ number may be
> zeroed out.

Which means the system integrity is compromised.

> During iteration via for_each_active_irq(), irq_get_next_irq() relies on
> irq_desc_get_irq(desc) to retrieve the next IRQ number. If a descriptor is
> corrupted, this can result in returning an IRQ number (e.g., 0) that is
> strictly less than the requested offset. This breaks the fundamental
> forward-progress guarantee of the iterator.
>
> This contract violation causes catastrophic unsigned integer underflows in
> callers. For instance, show_all_irqs() in fs/proc/stat.c calculates
> padding using (i - next). A corrupted descriptor returning 0 forces a
> massive unsigned underflow, trapping the CPU in an extensive loop inside
> show_irq_gap() and triggering a soft lockup watchdog.
>
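
For reference, the arithmetic described in the paragraph above can be
modelled in plain userspace C. This is only a sketch under assumptions:
gap_width() and made_forward_progress() are hypothetical names, not kernel
functions, and the actual show_all_irqs() code is not reproduced here.

```c
#include <limits.h>

/*
 * Hypothetical helper: the padding width a caller would compute as
 * (i - next), where i is the IRQ number handed back by the iterator and
 * next is the offset the caller expected to be at or beyond.
 */
static unsigned int gap_width(unsigned int i, unsigned int next)
{
	/* Unsigned subtraction wraps modulo UINT_MAX + 1 when i < next. */
	return i - next;
}

/*
 * Hypothetical check: did the iterator honour the forward-progress
 * contract, i.e. is the returned number not below the requested offset?
 */
static int made_forward_progress(unsigned int returned, unsigned int offset)
{
	return returned >= offset;
}
```

With i = 0 and next = 17, gap_width() yields UINT_MAX - 16 instead of a
small padding count, which is the size of loop the description refers to.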
> While the underlying issue is a memory corruption bug, core iterators
> should be resilient against returning values that violate their own
> mathematical boundaries and induce lockups in other subsystems.

Seriously?

If memory is corrupted and the corruption is detected, then the only
sensible thing is to panic the machine, not to paper over it in one
particular instance and hope that this is the only side effect.

Thanks,
tglx