Re: [RFC PATCH net-next 1/2] net: napi: Fix interrupts permanently disabled during busy poll
From: Martin Karsten
Date: Tue Apr 28 2026 - 20:06:02 EST
On 2026-04-28 19:40, Jakub Kicinski wrote:
On Tue, 28 Apr 2026 17:51:30 +0000 Dragos Tatulea wrote:
Under certain conditions a queue can be left out with interrupts
disabled and with the napi re-scheduling timer permanently stopped.
This behaviour is triggered by the napi busy poll path when
gro-flush-timeout and defer-hard-irq are set. Here's a sequence of
operations:
1. Busy poll starts, NAPI_STATE_SCHED is set to avoid rescheduling napi
from the timer.
2. During napi poll, driver disables interrupts due to being in poll
mode (napi_complete_done() returns false because napi->state has
NAPIF_STATE_IN_BUSY_POLL set).
Why does the driver have IRQs disabled in busy poll?
The problem occurs in irq deferral mode, when both gro-flush-timeout and defer-hard-irqs are nonzero and NIC interrupts are disabled.
3. At the end of the busy poll (busy_poll_stop()):
3.1 napi timer is scheduled and skip_schedule is set (due to config)
3.2 napi->poll() is called:
- driver poll() processes exactly budget packets
and exits early => napi not scheduled.
(interrupts are still disabled at this point)
3.3 Since napi poll processed budget packets, __busy_poll_stop()
is called with skip_schedule set => napi is not scheduled here
either.
with skip_schedule it calls:
clear_bit(NAPI_STATE_SCHED, &napi->state);
4. If the napi timer from 3.1 gets to be triggered due to slow napi poll
or some other reason, the timer will run with no effect (due to
NAPI_STATE_SCHED being set).
And here you claim STATE_SCHED is still set?
Labelling this as step 4 might be misleading, sorry! The concern is that a short enough timer (compared to the duration of the driver poll) can fire before the NAPI_STATE_SCHED bit is cleared at the end of Step 3.3.
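To make the window concrete, here is a toy user-space model of steps 3.1 to 3.3. It is only a sketch under assumptions: struct toy_napi and the toy_* helpers are made-up names for illustration, not kernel code, and the three booleans only loosely stand in for NAPI_STATE_SCHED, the NIC interrupt state, and the napi timer.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only (hypothetical names): not the kernel's struct napi_struct. */
struct toy_napi {
	bool sched;        /* stands in for NAPI_STATE_SCHED */
	bool irq_enabled;  /* NIC interrupt state */
	bool timer_armed;  /* stands in for the napi re-scheduling timer */
};

/* State while busy polling in irq deferral mode: SCHED held, irqs off. */
static struct toy_napi toy_napi_in_busy_poll(void)
{
	struct toy_napi n = { .sched = true, .irq_enabled = false,
			      .timer_armed = false };
	return n;
}

/* One-shot timer: it can only reschedule napi if SCHED is not held. */
static bool toy_timer_fire(struct toy_napi *n)
{
	assert(n->timer_armed);
	n->timer_armed = false;
	if (n->sched)
		return false;   /* timer ran with no effect */
	n->sched = true;        /* napi would get polled again */
	return true;
}

/*
 * Models steps 3.1-3.3. If timer_fires_early is true, the timer expires
 * during the driver poll, i.e. before SCHED is cleared in step 3.3.
 * Returns true if the timer managed to reschedule napi.
 */
static bool toy_busy_poll_stop(struct toy_napi *n, bool timer_fires_early)
{
	bool rescheduled = false;

	n->timer_armed = true;          /* 3.1: timer armed, skip_schedule set */

	/* 3.2: driver poll consumes the full budget; irqs stay disabled. */
	if (timer_fires_early)
		rescheduled = toy_timer_fire(n);

	n->sched = false;               /* 3.3: SCHED cleared, no reschedule */

	if (!timer_fires_early)
		rescheduled = toy_timer_fire(n);

	return rescheduled;
}

/* Stuck: nothing is left that could trigger another poll. */
static bool toy_is_stuck(const struct toy_napi *n)
{
	return !n->irq_enabled && !n->timer_armed && !n->sched;
}
```

With timer_fires_early set, toy_busy_poll_stop() returns false and toy_is_stuck() then reports true: interrupts off, timer spent, nothing scheduled. With it cleared, the timer fires after the bit is cleared and napi gets rescheduled.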
Thanks,
Martin