[PATCH] workqueue: release PENDING in __queue_work() drain/destroy reject path
From: Breno Leitao
Date: Thu May 07 2026 - 07:19:38 EST
The caller of __queue_work() owns WORK_STRUCT_PENDING, won via
test_and_set_bit() in queue_work_on()/__queue_delayed_work(). The
state machine documented above __queue_work() requires that owner
to either hand the token to a pwq (insert_work() -> set_work_pwq()),
hand it to a timer, or release it via set_work_pool_and_clear_pending().
try_to_grab_pending() relies on this: when it observes
"PENDING && off-queue" it busy-loops, trusting the current owner to
make progress.
The (__WQ_DESTROYING | __WQ_DRAINING) early-return path violates that
contract. It WARN_ONCE()s and bare-returns, leaving work->data with
PENDING set, WORK_STRUCT_PWQ clear, and work->entry empty.
The path is reachable without explicit API abuse: queue_delayed_work()
arms a timer with PENDING set; if drain_workqueue() runs while the
timer is still pending, delayed_work_timer_fn() -> __queue_work() in
softirq context hits the WARN, current is not a wq worker so
is_chained_work() is false, and the work is silently dropped with
PENDING leaked.
Mirror what clear_pending_if_disabled() already does on its analogous
reject path: unpack the off-queue data and call
set_work_pool_and_clear_pending() to release the token before
returning.
I was able to reproduce this by queueing several slow works on
a max_active=1 wq, arm a delayed_work whose timer fires while
drain_workqueue() is blocked, then call cancel_delayed_work_sync().
Without this patch the cancel livelocks at 100% CPU; with it the cancel
returns immediately.
Signed-off-by: Breno Leitao <leitao@xxxxxxxxxx>
---
not sure you want to have a Fixes tag, but, if you do, I would point it
to Fixes: e41e704bc4f4 ("workqueue: improve destroy_workqueue() debuggability")
---
kernel/workqueue.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 2506b5cfbb133..885be263b2825 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -2296,6 +2296,18 @@ static void __queue_work(int cpu, struct workqueue_struct *wq,
if (unlikely(wq->flags & (__WQ_DESTROYING | __WQ_DRAINING) &&
WARN_ONCE(!is_chained_work(wq), "workqueue: cannot queue %ps on wq %s\n",
work->func, wq->name))) {
+ struct work_offq_data offqd;
+
+ /*
+ * State on entry: PENDING is set, work is off-queue (no
+ * insert_work() has run).
+ *
+ * Returning without clearing PENDING would leave the work
+ * in a weird state (PENDING=1, PWQ=0, entry empty)
+ */
+ work_offqd_unpack(&offqd, *work_data_bits(work));
+ set_work_pool_and_clear_pending(work, offqd.pool_id,
+ work_offqd_pack_flags(&offqd));
return;
}
rcu_read_lock();
---
base-commit: 735d2f48cadaa9a87e7c7601667878de70c771c5
change-id: 20260507-workqueue_pending-b91beb94ef46
Best regards,
--
Breno Leitao <leitao@xxxxxxxxxx>