Re: [PATCH net-next v2 12/14] net: macb: re-read ISR inside IRQ handler locked section

From: Théo Lebrun

Date: Wed Apr 15 2026 - 10:13:46 EST


Hello Jakub's LLM,

On Tue Apr 14, 2026 at 2:52 AM CEST, Jakub Kicinski wrote:
> This is an AI-generated review of your patch. The human sending this
> email has considered the AI review valid, or at least plausible.
> ---
> net: macb: re-read ISR inside IRQ handler locked section
>
> This commit adds a second read of the ISR register inside the lock in
> macb_interrupt() to handle a race condition where the status might have
> been updated by concurrent hardware resets.
>
>> diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
>> index 2633234012bae..3deab87c565b5 100644
>> --- a/drivers/net/ethernet/cadence/macb_main.c
>> +++ b/drivers/net/ethernet/cadence/macb_main.c
>> @@ -2190,6 +2190,13 @@ static irqreturn_t macb_interrupt(int irq, void *dev_id)
>>
>> spin_lock(&bp->lock);
>>
>> + /* `status` stack variable might be stalled => re-read it */
>> + status = queue_readl(queue, ISR);
>
> Will this second read discard interrupt events on older MACB variants that
> use clear-on-read for the ISR?
>
> If the hardware does not have the MACB_CAPS_ISR_CLEAR_ON_WRITE capability,
> the first queue_readl call before the lock already cleared the active bits.
>
> Overwriting the local status variable here sets it to 0, which permanently
> drops the initially captured Rx/Tx completions and stalls the network
> interface.

The AI is correct. For R1C, we must check ISR inside the bp->lock
critical section. For W1C, we can check it outside to detect spurious
interrupt without grabbing bp->lock and then we check again in bp->lock.

So:

static irqreturn_t macb_interrupt(int irq, void *dev_id)
{
// ...

/* detect spurious interrupts without grabbing bp->lock */
if (bp->caps & MACB_CAPS_ISR_CLEAR_ON_WRITE) {
status = queue_readl(queue, ISR);
if (unlikely(!status))
return IRQ_NONE;
}

spin_lock(&bp->lock);

status = queue_readl(queue, ISR);
if (unlikely(!status)) {
spin_unlock(&bp->lock);
return IRQ_NONE;
}

// ...
}

Or we could drop the readl(ISR) from outside bp->lock.

>> + if (unlikely(!status)) {
>> + spin_unlock(&bp->lock);
>> + return IRQ_NONE;
>
> If the function reaches this point, the initial status read before the lock
> was non-zero, meaning the device legitimately asserted the interrupt.
>
> If a concurrent operation cleared the hardware state while this thread
> waited for the lock, will returning IRQ_NONE falsely flag this as a
> spurious interrupt?
>
> Since the interrupt was actually raised by the device, doing this repeatedly
> can cause the core IRQ subsystem to permanently disable the IRQ line.
>
> Could this return IRQ_HANDLED instead?

Disagreed on this one. This codepath, if it happens, is expected to be
infrequent and to protect against races by the swap operation disabling
the device. There is no chance we often land here and end up disabling
the IRQ line.

Thanks,

--
Théo Lebrun, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com