Re: [PATCH net] netpoll: run NAPI poll in softirq context to avoid rq->lock self-deadlock

From: Sebastian Andrzej Siewior

Date: Thu Jun 18 2026 - 07:19:17 EST


On 2026-06-16 19:02:57 [+0200], Peter Zijlstra wrote:
> On Tue, Jun 16, 2026 at 12:35:29PM +0200, Sebastian Andrzej Siewior wrote:
>
> > So this is not an issue since commit 7eab73b18630e ("netconsole: convert
> > to NBCON console infrastructure"). Because from here now on writes are
> > deferred to the nbcon thread. So this purely about -stable in this case.
>
> Hmm, I thought netconsole had some reserved skbs and could to writes
> 'atomic' like? That said, it was 2.6 era the last time I looked at
> netconsole.

Let's look at 8250 for a second in this scenario.
serial8250_console_write() -> uart_port_lock_irqsave(). The uart lock is
a spinlock_t. lockdep does not complain because printk annotates it,
since with RT NBCON is mandatory and this path is not used.
serial8250_console_write() -> serial8250_modem_status() does a
wake_up_interruptible(). Even if not from here, it is called under the
port lock elsewhere, so eventually lockdep will see it and complain about
rq lock vs port lock ordering.

> > Now. The scheduler usually does printk_deferred() because of the rq lock
> > so it does not deadlock for various reasons. It is kind of a pity that
> > the various WARN macros don't do that.
>
> People have tried, last time was here:
>
> https://lkml.kernel.org/r/20260611074344.GG48970@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>
> and I hate deferred with a passion. It means you'll never see the
> message when you wreck the machine.

Oh, I do hate them, too. Maybe not as much because I spread my hate
evenly across the code. I did *miss* output on RT because the box
crashed before the output was sent, so the hate is there.

> > We could add printk_deferred_enter/exit() to all the rq_lock() variants.
> > I think PeterZ loves this the most. And Greg will appreciate it too
> > while backporting because of all the context changes.
>
> No, not going to happen, ever, sorry. Instead printk should delete
> console sem and have printk() itself be atomic safe.

That was not meant seriously, just as a possibility.

> As stated, printk deferred is an abomination and needs to die a horrible
> painful death.
>
> As described here:
>
> https://lkml.kernel.org/r/20260611191922.GK187714@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>
> "So printk should:
>
> - stick msg in buffer (lockless)
> - print to atomic consoles (lockless)
> - use irq_work to wake console kthreads (lockless)
> - each kthread then tries to flush buffer to its own non-atomic console
> in non-atomic context."

So we do this with nbcon afaik and this is the plan forward. The 8250 is
stuck behind broken flow control that John works tirelessly on fixing
before the 8250 can move over to nbcon land. At some point it might
be possible to force-thread legacy consoles as we do on RT or remove
them due to no users.
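
To make the four steps quoted above concrete, here is a single-threaded
userspace model. It is only a sketch: every name in it is made up for
illustration, none of it is the nbcon API, and it ignores buffer
wraparound and record publishing, which a real lockless ringbuffer has to
handle.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

#define LOG_SLOTS 16
#define MSG_LEN   64

/* Hypothetical log buffer; overwrite on wraparound is not handled here. */
static char        log_buf[LOG_SLOTS][MSG_LEN];
static atomic_uint log_next;      /* next sequence number to hand out */
static atomic_bool wake_pending;  /* stands in for a queued irq_work */

struct model_console {
	const char *name;
	bool atomic_safe;         /* may be written from any context */
	unsigned int seq;         /* next record this console has to print */
	unsigned int printed;     /* records emitted so far */
};

/* Print every record this console has not seen yet. */
static void sketch_emit(struct model_console *con)
{
	unsigned int end = atomic_load(&log_next);

	while (con->seq < end) {
		printf("[%s] %s\n", con->name, log_buf[con->seq % LOG_SLOTS]);
		con->seq++;
		con->printed++;
	}
}

static void sketch_printk(struct model_console *cons, int nr, const char *msg)
{
	/* Step 1: reserve a slot without a lock and store the message. */
	unsigned int seq = atomic_fetch_add(&log_next, 1);

	snprintf(log_buf[seq % LOG_SLOTS], MSG_LEN, "%s", msg);

	/* Step 2: atomic consoles print right away, in the caller's context. */
	for (int i = 0; i < nr; i++)
		if (cons[i].atomic_safe)
			sketch_emit(&cons[i]);

	/* Step 3: only set a flag; nothing is woken directly from here. */
	atomic_store(&wake_pending, true);
}

/* Step 4: runs later, in a context where the slow consoles may sleep. */
static void sketch_kthread_flush(struct model_console *cons, int nr)
{
	if (!atomic_exchange(&wake_pending, false))
		return;

	for (int i = 0; i < nr; i++)
		if (!cons[i].atomic_safe)
			sketch_emit(&cons[i]);
}
```

The point of the model: sketch_printk() never touches the non-atomic
console, so whatever locks that console needs are never taken in the
caller's context.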

However, until then and for stable, I suggest the following:

diff --git a/include/asm-generic/bug.h b/include/asm-generic/bug.h
index 09e8eccee8ed9..9cba16474cb6e 100644
--- a/include/asm-generic/bug.h
+++ b/include/asm-generic/bug.h
@@ -115,6 +115,17 @@ extern __printf(1, 2) void __warn_printk(const char *fmt, ...);
})
#endif

+#define WARN_ON_DEFERRED(condition) ({ \
+ int __ret_warn_on = !!(condition); \
+ if (unlikely(__ret_warn_on)) { \
+ printk_deferred_enter(); \
+ __WARN_FLAGS(#condition, \
+ BUGFLAG_TAINT(TAINT_WARN)); \
+ printk_deferred_exit(); \
+ } \
+ unlikely(__ret_warn_on); \
+})
+
#ifndef WARN_ON_ONCE
#define WARN_ON_ONCE(condition) ({ \
int __ret_warn_on = !!(condition); \
@@ -125,6 +136,18 @@ extern __printf(1, 2) void __warn_printk(const char *fmt, ...);
unlikely(__ret_warn_on); \
})
#endif
+
+#define WARN_ON_ONCE_DEFERRED(condition) ({ \
+ int __ret_warn_on = !!(condition); \
+ if (unlikely(__ret_warn_on)) { \
+ printk_deferred_enter(); \
+ __WARN_FLAGS(#condition, \
+ BUGFLAG_ONCE | \
+ BUGFLAG_TAINT(TAINT_WARN)); \
+ printk_deferred_exit(); \
+ } \
+ unlikely(__ret_warn_on); \
+})
#endif /* __WARN_FLAGS */

#if defined(__WARN_FLAGS) && !defined(__WARN_printf)
@@ -159,6 +182,18 @@ extern __printf(1, 2) void __warn_printk(const char *fmt, ...);
})
#endif

+#ifndef WARN_ON_DEFERRED
+#define WARN_ON_DEFERRED(condition) ({ \
+ int __ret_warn_on = !!(condition); \
+ if (unlikely(__ret_warn_on)) { \
+ printk_deferred_enter(); \
+ __WARN(); \
+ printk_deferred_exit(); \
+ } \
+ unlikely(__ret_warn_on); \
+})
+#endif
+
#ifndef WARN
#define WARN(condition, format...) ({ \
int __ret_warn_on = !!(condition); \
@@ -180,6 +215,11 @@ extern __printf(1, 2) void __warn_printk(const char *fmt, ...);
DO_ONCE_LITE_IF(condition, WARN_ON, 1)
#endif

+#ifndef WARN_ON_ONCE_DEFERRED
+#define WARN_ON_ONCE_DEFERRED(condition) \
+ DO_ONCE_LITE_IF(condition, WARN_ON_DEFERRED, 1)
+#endif
+
#ifndef WARN_ONCE
#define WARN_ONCE(condition, format...) \
DO_ONCE_LITE_IF(condition, WARN, 1, format)
@@ -215,7 +255,9 @@ extern __printf(1, 2) void __warn_printk(const char *fmt, ...);
})
#endif

+#define WARN_ON_DEFERRED(condition) WARN_ON(condition)
#define WARN_ON_ONCE(condition) WARN_ON(condition)
+#define WARN_ON_ONCE_DEFERRED(condition) WARN_ON(condition)
#define WARN_ONCE(condition, format...) WARN(condition, format)
#define WARN_TAINT(condition, taint, format...) WARN(condition, format)
#define WARN_TAINT_ONCE(condition, taint, format...) WARN(condition, format)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 3ebec186f9823..439379e6a83de 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5814,7 +5814,7 @@ static void put_prev_entity(struct cfs_rq *cfs_rq, struct sched_entity *prev)
/* in !on_rq case, update occurred at dequeue */
update_load_avg(cfs_rq, prev, 0);
}
- WARN_ON_ONCE(cfs_rq->curr != prev);
+ WARN_ON_ONCE_DEFERRED(cfs_rq->curr != prev);
cfs_rq->curr = NULL;
}


This plus the other occurrences in sched under rq lock.

If I replace the above WARN_ON_ONCE with
WARN_ON_ONCE(system_state >= SYSTEM_RUNNING);

then my box fails to boot. Which means the warning seems harmful as of
today. The disgusting _DEFERRED workaround gets the box to boot until
we are in nbcon land.
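
For anyone not familiar with the deferred section, a userspace model of
what the workaround relies on. This is a sketch with made-up names, not
the printk implementation: while the nesting counter is non-zero the
message is only queued, and the console driver is called later from a
context that does not hold the rq lock.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_MSGS 8
#define MSG_LEN  64

static int  deferred_depth;              /* models a per-CPU nesting counter */
static char pending[MAX_MSGS][MSG_LEN];  /* messages held back while deferred */
static int  nr_pending;
static int  console_writes;              /* messages that reached the console */

static void model_deferred_enter(void)
{
	deferred_depth++;
}

static void model_deferred_exit(void)
{
	assert(deferred_depth > 0);
	deferred_depth--;
}

/* Stand-in for a legacy console driver that may take locks or wake tasks. */
static void model_console_write(const char *msg)
{
	console_writes++;
	printf("console: %s\n", msg);
}

static void model_printk(const char *msg)
{
	if (deferred_depth > 0) {
		/* Deferred: store only, never call into the console driver. */
		if (nr_pending < MAX_MSGS) {
			snprintf(pending[nr_pending], MSG_LEN, "%s", msg);
			nr_pending++;
		}
		return;
	}
	model_console_write(msg);
}

/* Runs later, outside of the locked region (irq_work in this model). */
static void model_flush(void)
{
	for (int i = 0; i < nr_pending; i++)
		model_console_write(pending[i]);
	nr_pending = 0;
}

/* Same shape as the proposed macro: enter, warn, exit, return the condition. */
static int model_warn_on_deferred(int cond, const char *expr)
{
	char msg[MSG_LEN];

	if (cond) {
		model_deferred_enter();
		snprintf(msg, sizeof(msg), "WARNING: %s", expr);
		model_printk(msg);
		model_deferred_exit();
	}
	return cond;
}

#define MODEL_WARN_ON_DEFERRED(cond) model_warn_on_deferred(!!(cond), #cond)
```

In this model MODEL_WARN_ON_DEFERRED() with a true condition queues one
message and does not call the console until model_flush() runs. That is
also the downside discussed above: if the box dies before the flush, the
message is lost.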

Sebastian