Re: [RFC PATCH net-next 1/2] net: napi: Fix interrupts permanently disabled during busy poll

From: Martin Karsten

Date: Tue Apr 28 2026 - 20:06:02 EST


On 2026-04-28 19:40, Jakub Kicinski wrote:
> On Tue, 28 Apr 2026 17:51:30 +0000 Dragos Tatulea wrote:
>> Under certain conditions a queue can be left out with interrupts
>> disabled and with the napi re-scheduling timer permanently stopped.
>> This behaviour is triggered by the napi busy poll path when
>> gro-flush-timeout and defer-hard-irq are set. Here's a sequence of
>> operations:
>>
>> 1. Busy poll starts, NAPI_STATE_SCHED is set to avoid rescheduling napi
>> from the timer.
>>
>> 2. During napi poll, driver disables interrupts due to being in poll
>> mode (napi_complete_done() returns false because napi->state has
>> NAPIF_STATE_IN_BUSY_POLL set).
>
> Why does the driver have IRQs disabled in busy poll?

The problem occurs in IRQ deferral mode, when both gro-flush-timeout and defer-hard-irqs are nonzero and NIC interrupts are disabled.

>> 3. At the end of the busy poll (busy_poll_stop()):
>> 3.1 napi timer is scheduled and skip_schedule is set (due to config)
>> 3.2 napi->poll() is called:
>> - driver poll() processes exactly budget packets
>> and exits early => napi not scheduled.
>> (interrupts are still disabled at this point)
>> 3.3 Since napi poll processed budget packets, __busy_poll_stop()
>> is called with skip_schedule set => napi is not scheduled here
>> either.
>
> with skip_schedule it calls:
>
> clear_bit(NAPI_STATE_SCHED, &napi->state);
>
>> 4. If the napi timer from 3.1 gets to be triggered due to slow napi poll
>> or some other reason, the timer will run with no effect (due to
>> NAPI_STATE_SCHED being set).
>
> And here you claim STATE_SCHED is still set?

Labelling this as step 4 might be misleading, sorry! The concern is that a short enough timer (relative to the duration of the driver poll) can fire before the NAPI_STATE_SCHED bit is cleared at the end of step 3.3.
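
To make the ordering concrete, here is a rough userspace model of the interleaving. It is only a sketch, not kernel code: struct model_napi and the model_*() helpers are made-up names, the three booleans stand in for NAPI_STATE_SCHED, the NIC interrupt state and the napi timer, and the timer is assumed to fire exactly once, either during the driver poll or after it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct model_napi {
	bool sched;		/* stands in for NAPI_STATE_SCHED */
	bool irq_enabled;	/* NIC interrupt state */
	bool timer_armed;	/* stands in for the napi re-scheduling timer */
};

/* Timer expiry: it can only schedule napi if SCHED is not already held. */
static bool model_timer_fire(struct model_napi *n)
{
	n->timer_armed = false;
	if (n->sched)
		return false;	/* no effect, napi looks owned by someone else */
	n->sched = true;	/* napi gets scheduled and will poll again */
	return true;
}

/*
 * Steps 3.1 to 3.3. Returns true if napi ends up rescheduled.
 * timer_fires_during_poll selects the ordering under discussion.
 */
static bool model_busy_poll_stop(struct model_napi *n,
				 bool timer_fires_during_poll)
{
	bool rescheduled = false;

	/* Busy poll owns SCHED and the driver has interrupts off. */
	assert(n->sched && !n->irq_enabled);

	n->timer_armed = true;			/* 3.1: timer is armed */

	/* 3.2: poll uses the full budget, interrupts stay disabled. */
	if (timer_fires_during_poll)
		rescheduled = model_timer_fire(n);

	n->sched = false;			/* 3.3: SCHED cleared, no schedule */

	/* Expected ordering: the timer expires after SCHED was cleared. */
	if (n->timer_armed)
		rescheduled = model_timer_fire(n);

	return rescheduled;
}

/* Nothing left that could restart processing on this queue. */
static bool model_stuck(const struct model_napi *n)
{
	return !n->irq_enabled && !n->timer_armed && !n->sched;
}
```

Starting from { .sched = true, .irq_enabled = false }, calling model_busy_poll_stop() with timer_fires_during_poll = false reschedules napi, while passing true leaves the queue in the model_stuck() state, which is the case I mean.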

Thanks,
Martin