[PATCH v3 00/14] rcu: fix stuck defer_qs_pending state, add rescue timer and torture tests

From: Joel Fernandes

Date: Thu Jun 18 2026 - 14:51:24 EST


This series fixes a bug where rdp->defer_qs_pending can remain stuck in
PENDING when a preempted reader's quiescent state is reported up-tree via
a path other than the deferred-QS irq-work handler (FQS scan, hotplug
transition, expedited GP IPI, context switch). Once stuck, the pending
gate in rcu_read_unlock_special() silently suppresses all future arming
attempts on that CPU. Patches 1-7 add PENDING -> IDLE transitions at the
missing sites, including the case where the deferred-QS irq-work handler
may run between segments of a compound section (per Paul McKenney's
counter-example). Within that range, patch 6 sets need_resched on the
softirq deferred-QS arming path.
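
For illustration only, the stuck-gate failure mode can be modeled in
userspace C. This is a simplified sketch, not the kernel code: every
model_* name is hypothetical, the qs_owed flag is invented for the model,
and the handler is assumed to reset the state only when a QS is still owed.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; these types and functions are hypothetical. */
enum model_defer_qs { MODEL_DEFER_QS_IDLE, MODEL_DEFER_QS_PENDING };

struct model_cpu {
	enum model_defer_qs defer_qs_pending;	/* stands in for rdp->defer_qs_pending */
	bool qs_owed;				/* a deferred QS still needs reporting */
	unsigned int arms;			/* successful arming attempts */
};

/* Models the pending gate in rcu_read_unlock_special(): arm only from IDLE. */
bool model_unlock_special(struct model_cpu *cpu)
{
	assert(cpu);
	cpu->qs_owed = true;
	if (cpu->defer_qs_pending == MODEL_DEFER_QS_PENDING)
		return false;	/* gate closed, arming suppressed */
	cpu->defer_qs_pending = MODEL_DEFER_QS_PENDING;
	cpu->arms++;
	return true;
}

/* Models a QS reported by another path (FQS scan, hotplug, ...). */
void model_report_other_path(struct model_cpu *cpu, bool with_fix)
{
	assert(cpu);
	cpu->qs_owed = false;
	if (with_fix)
		cpu->defer_qs_pending = MODEL_DEFER_QS_IDLE;	/* PENDING -> IDLE */
}

/* Models a late irq-work handler; assumed to reset state only if a QS is owed. */
void model_irq_work_handler(struct model_cpu *cpu)
{
	assert(cpu);
	if (!cpu->qs_owed)
		return;		/* nothing left to report, state untouched */
	cpu->qs_owed = false;
	cpu->defer_qs_pending = MODEL_DEFER_QS_IDLE;
}

/* Runs 'rounds' reader exits in which the other path always wins the race. */
unsigned int model_count_arms(bool with_fix, unsigned int rounds)
{
	struct model_cpu cpu = { MODEL_DEFER_QS_IDLE, false, 0 };
	unsigned int i;

	for (i = 0; i < rounds; i++) {
		model_unlock_special(&cpu);
		model_report_other_path(&cpu, with_fix);
		model_irq_work_handler(&cpu);
	}
	return cpu.arms;
}
```

In this model, model_count_arms(false, 5) arms once and is then gated
forever, while model_count_arms(true, 5) arms on every round.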

Patch 8 adds a per-CPU rescue hrtimer that bounds the worst-case
deferred-QS reporting latency. When the irq-work handler lands in a clean
(non-reader, non-compound) context, it reports the quiescent state directly
via the new rcu_preempt_deferred_qs_try_report() helper. The rescue timer
reuses the same helper so that, under preempt=none, the QS is reported
promptly without depending on the scheduler.
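
As a rough userspace sketch of that shape (not the code in patch 8), the
handler and the rescue timer can share one "report only if clean" helper.
The struct fields, the model_* names, and the re-arm-until-clean policy
below are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-CPU context for the model. */
struct model_ctx {
	int nesting;		/* models the read-side nesting depth */
	bool in_compound;	/* assumed flag: between compound-section segments */
	bool qs_deferred;	/* a deferred QS is waiting to be reported */
	bool pending;		/* models defer_qs_pending == PENDING */
	unsigned int reports;	/* QS reports issued by the helper */
};

enum model_timer_ret { MODEL_TIMER_NORESTART, MODEL_TIMER_RESTART };

/* Stand-in for rcu_preempt_deferred_qs_try_report(); the body is assumed. */
bool model_try_report(struct model_ctx *c)
{
	assert(c);
	if (c->nesting > 0 || c->in_compound)
		return false;	/* not a clean context, leave state alone */
	if (c->qs_deferred) {
		c->qs_deferred = false;
		c->reports++;
	}
	c->pending = false;	/* PENDING -> IDLE */
	return true;
}

/* irq-work handler: reports directly when it lands in a clean context. */
void model_deferred_qs_handler(struct model_ctx *c)
{
	model_try_report(c);
}

/* Rescue timer callback: same helper, re-armed until the report succeeds. */
enum model_timer_ret model_rescue_timer_fn(struct model_ctx *c)
{
	assert(c);
	if (!c->qs_deferred)
		return MODEL_TIMER_NORESTART;
	return model_try_report(c) ? MODEL_TIMER_NORESTART : MODEL_TIMER_RESTART;
}

/* Counts timer firings when the CPU stays in a reader for 'busy_ticks'. */
unsigned int model_ticks_until_report(unsigned int busy_ticks)
{
	struct model_ctx c = { .nesting = 1, .qs_deferred = true, .pending = true };
	unsigned int ticks = 0;

	for (;;) {
		ticks++;
		if (ticks > busy_ticks)
			c.nesting = 0;	/* reader has exited */
		if (model_rescue_timer_fn(&c) == MODEL_TIMER_NORESTART)
			return ticks;
	}
}
```

The point of the model is that the report happens from the timer callback
itself, one firing after the context becomes clean, with no scheduler
involvement.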

Patches 9-13 add rcutorture coverage for the reader-end deboost behavior
(three from Paul, two from me). These were previously posted on their own
as an RFC; they are folded in here so the fix and its test coverage can be
reviewed together.

The last patch is a debug-only detector (CONFIG_RCU_GP_CLEANUP_STALE_CHECK,
marked [TEST COMMIT], not for merge). Applied alone on mainline without the
fixes, it reliably fires a WARN within 5 minutes under TREE03 rcutorture,
confirming that the bug exists and that the detector catches it. With the
full fix applied, I could not reproduce the issue.
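
A detector of this kind can be pictured as a scan over per-CPU state at a
checkpoint. The sketch below is a userspace model under stated assumptions,
not the [TEST COMMIT] itself: the model_* names, the qs_owed field, and the
staleness criterion (PENDING while no QS is owed) are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

enum model_state { MODEL_IDLE, MODEL_PENDING };

/* Hypothetical stand-in for the per-CPU rcu_data fields of interest. */
struct model_rdp {
	enum model_state defer_qs_pending;
	bool qs_owed;	/* assumed: CPU still owes a deferred QS */
};

/*
 * Assumed criterion: a CPU that is PENDING at GP cleanup while owing no
 * QS has a stale state. Returns the number of stale CPUs found.
 */
size_t model_gp_cleanup_stale_check(const struct model_rdp *rdp, size_t ncpus)
{
	size_t cpu, stale = 0;

	assert(rdp || ncpus == 0);
	for (cpu = 0; cpu < ncpus; cpu++) {
		if (rdp[cpu].defer_qs_pending == MODEL_PENDING &&
		    !rdp[cpu].qs_owed) {
			fprintf(stderr,
				"WARN: CPU %zu: defer_qs_pending stuck in PENDING\n",
				cpu);
			stale++;
		}
	}
	return stale;
}
```

A CPU that is PENDING and still owes a QS is treated as legitimately armed
in this model, so only the stale case produces a warning.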

The git tree with all patches can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux.git (tag: rcu-dqs-stuck-v3-20260618)

Change log:

Changes from v2 to v3:
- Folded in the rcutorture "reader-end deboost testing" patches (three from
  Paul, two from me), previously posted separately as an RFC, so the fix
  and its test coverage can be reviewed together:
  https://lore.kernel.org/all/20260616222622.2981876-1-joelagnelf@xxxxxxxxxx/
- New patch "rcu: add per-CPU rescue hrtimer for deferred-QS reporting" to
  bound the worst-case deferred-QS reporting latency.
- New patch "rcu: clear defer_qs_pending in deferred-QS bail when nesting > 0".
- Reworked "rcu: clear defer_qs_pending in handler for compounded sections":
  the irq-work handler now reports the deferred QS directly via the new
  rcu_preempt_deferred_qs_try_report() helper when it lands in a clean
  context, instead of only nudging the scheduler.

Changes from v1 to v2:
- Dropped the RFC tag now that the softirq paths have been investigated.
- Added new patch "rcu: set need_resched on softirq deferred-QS arming
  path" to handle the softirq arming case that was deferred in v1.

Link to v2: https://lore.kernel.org/all/20260526225014.314734-1-joelagnelf@xxxxxxxxxx/
Link to v1: https://lore.kernel.org/all/20260522142342.1536533-1-joelagnelf@xxxxxxxxxx/

Joel Fernandes (11):
  rcu: introduce rcu_defer_qs_clear() helper
  rcu: clear defer_qs_pending when notifying GP changes
  rcu: clear defer_qs_pending in handler for compounded sections
  rcu: drop redundant defer_qs_pending clear in irqrestore handler
  rcu: clear defer_qs_pending at expedited IPI entry
  rcu: set need_resched on softirq deferred-QS arming path
  rcu: clear defer_qs_pending in deferred-QS bail when nesting > 0
  rcu: add per-CPU rescue hrtimer for deferred-QS reporting
  rcutorture: tighten boost-WARN to exclude any implicit-reader context
  rcutorture: give async deboost mechanisms up to 500us before WARN
  [TEST COMMIT] rcu: detect stuck defer_qs_pending at GP cleanup

Paul E. McKenney (3):
  rcutorture: Abstract reader-segment dump into
    rcu_torture_dump_read_segs()
  rcutorture: Check for immediate deboosting at reader end
  rcutorture: Test RCU readers from hardware interrupt handlers

 kernel/rcu/Kconfig.debug |  11 ++
 kernel/rcu/rcu.h         |   7 ++
 kernel/rcu/rcutorture.c  | 257 +++++++++++++++++++++++++++------------
 kernel/rcu/tree.c        |  50 ++++++++
 kernel/rcu/tree.h        |  14 +++
 kernel/rcu/tree_exp.h    |   6 +
 kernel/rcu/tree_plugin.h | 169 ++++++++++++++++++++++---
 7 files changed, 419 insertions(+), 95 deletions(-)


base-commit: 95c7d025cc8c3c6c41206e2a18332eb04878b7ef
--
2.34.1