[PATCH RFC 0/3] workqueue: improve stall diagnostics for pools with no running worker

From: Breno Leitao

Date: Tue Jun 16 2026 - 12:51:03 EST

The workqueue watchdog fires when a pool stops making progress, but the
diagnostics it printed did not always point at the culprit.

Commit 8823eaef45da7 ("workqueue: Show all busy workers in stall
diagnostics") made the watchdog dump every in-flight worker in the
stalled pool's busy_hash, including workers that are not running on the
CPU. As Petr Mladek pointed out, that is rarely useful: a worker that is
merely sleeping inside a work item does not, by itself, hold up the pool
-- the scheduler calls wq_worker_sleeping() and the pool wakes (or forks)
another worker to keep going. Dumping all of those sleeping workers
mostly adds noise and can do more harm than good.

The condition actually worth reporting is the opposite one: a pool that
is stalled with *no* running worker at all. That means the pool failed
to get a worker onto the CPU -- it could not wake an idle worker, could
not fork a new one, or the CPU is busy running something that is not a
workqueue worker. In that case the previous code printed an empty
backtrace section and gave no hint about what went wrong.

Following Petr's suggestion, this series reworks the diagnostics:

1) workqueue: only show running workers in stall diagnostics
Restore the task_is_running() filter (reverting the behaviour of
8823eaef45da7) so only running workers are dumped, and explicitly
report when a stalled pool has no running worker, printing the pool
id, CPU, idle state and worker counts.

2) workqueue: trigger a single-CPU backtrace for stalled pools
Trigger a backtrace of the stalled CPU so whatever task is actually
occupying it -- workqueue worker or not -- is captured.

3) workqueue: dump the last woken worker for stalled pools
Record the worker last woken by kick_pool() and dump its backtrace.
It is the prime suspect: the idle worker kicked to take over when
the previous running worker went to sleep.

The reworked diagnostics have been running on the Meta fleet (backported
to 6.16) and finally explained a long-standing arm64 stall that the old
output could not: an EFI runtime call wedging an efi_rts_wq worker on a
machine without NMI. The single-CPU backtrace from patch 2 pinpointed
efi_call_rts() as the stuck task [1].

Example output, reproduced with the in-tree stall detector sample
(samples/workqueue/stall_detector/wq_stall.c); the lines added by this
series are marked "<-- new":

BUG: workqueue lockup - pool cpus=2 node=0 flags=0x0 nice=0 stuck for 30s!
Showing busy workqueues and worker pools:
workqueue events: flags=0x100
pwq 10: cpus=2 node=0 flags=0x0 nice=0 active=5 refcnt=6
in-flight: 58:stall_work1_fn [wq_stall] for 30s
pending: stall_work2_fn [wq_stall], ...
pool 10: cpus=2 node=0 flags=0x0 nice=0 hung=30s workers=2 idle: 33
Showing backtraces of busy workers in stalled worker pools:
pool 10: no worker in running state, cpu=2 is idle (nr_workers=2 nr_idle=1) <-- new
The pool might have trouble waking an idle worker. <-- new
Backtrace of last woken worker: <-- new
task:kworker/2:1 state:I pid:58 <-- new
__schedule+0x8fd/0xfc0 <-- new
stall_work1_fn+0xb2/0x100 [wq_stall] <-- new
process_scheduled_works+0x254/0x4e0 <-- new
worker_thread+0x222/0x340 <-- new
Sending NMI from CPU 0 to CPUs 2: <-- new
NMI backtrace for cpu 2 (idle: default_idle / do_idle) <-- new

[1] https://lore.kernel.org/all/20260616-efi_timeout-v3-0-76dd1d26657b@xxxxxxxxxx/

Cc; marco.crivellari@xxxxxxxx

Signed-off-by: Breno Leitao <leitao@xxxxxxxxxx>
---
Breno Leitao (3):
workqueue: only show running workers in stall diagnostics
workqueue: trigger a single-CPU backtrace for stalled pools
workqueue: dump the last woken worker for stalled pools

kernel/workqueue.c | 75 ++++++++++++++++++++++++++++++++++++++++++++++++++----
1 file changed, 70 insertions(+), 5 deletions(-)
---
base-commit: 4fa3f5fabb30bf00d7475d5a33459ea83d639bf9
change-id: 20260616-wq_dump_petr-7fcf43940204

Best regards,
--
Breno Leitao <leitao@xxxxxxxxxx>