[PATCH v4 0/8] rcu: fix stuck defer_qs_pending state and add rescue timer
From: Joel Fernandes
Date: Thu Jun 25 2026 - 20:43:52 EST
This series fixes a bug where rdp->defer_qs_pending can remain stuck in
PENDING when a preempted reader's quiescent state is reported up-tree via
a path other than the deferred-QS irq-work handler (FQS scan, hotplug
transition, expedited GP IPI, context switch). Once stuck, the pending
gate in rcu_read_unlock_special() silently suppresses all future arming
attempts on that CPU. Patches 1-7 add PENDING -> IDLE transitions at the
sites that lacked them. This includes the case where the deferred-QS
irq-work handler may run between segments of a compound section (per Paul
McKenney's counter-example), and it covers the softirq deferred-QS arming
path.
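
As a rough illustration of the failure mode described above, here is a
minimal user-space model, not kernel code. The struct, the model_*() helper
names and the with_fix flag are hypothetical and exist only for this sketch;
only the PENDING/IDLE naming follows the description above.

```c
#include <assert.h>
#include <stdbool.h>

/* User-space model only; these are not the kernel's definitions. */
enum model_defer_qs_state { MODEL_DEFER_QS_IDLE, MODEL_DEFER_QS_PENDING };

struct model_rdp {
	enum model_defer_qs_state defer_qs_pending;
	int arm_count;	/* how many times deferred-QS work was armed */
};

/* Models the pending gate: arm only if nothing is already pending. */
static bool model_try_arm(struct model_rdp *rdp)
{
	if (rdp->defer_qs_pending == MODEL_DEFER_QS_PENDING)
		return false;	/* gate suppresses this arming attempt */
	rdp->defer_qs_pending = MODEL_DEFER_QS_PENDING;
	rdp->arm_count++;
	return true;
}

/*
 * Models a QS reported by some path other than the irq-work handler.
 * Without the fix the state is left untouched; with the fix the
 * reporting path performs the PENDING -> IDLE transition.
 */
static void model_report_qs_other_path(struct model_rdp *rdp, bool with_fix)
{
	if (with_fix)
		rdp->defer_qs_pending = MODEL_DEFER_QS_IDLE;
}

/* Scenario: arm, report via another path, then try to arm again. */
bool model_second_arm_succeeds(bool with_fix)
{
	struct model_rdp rdp = { MODEL_DEFER_QS_IDLE, 0 };

	model_try_arm(&rdp);
	model_report_qs_other_path(&rdp, with_fix);
	return model_try_arm(&rdp);
}
```

In this model, model_second_arm_succeeds(false) returns false because the
state is still PENDING, which mirrors the stuck gate. With with_fix set, the
second arming attempt goes through.
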
Patch 8 adds a per-CPU rescue hrtimer that bounds the worst-case
deferred-QS reporting latency. When the irq-work handler lands in a clean
(non-reader, non-compound) context, it reports the quiescent state directly
via the new rcu_preempt_deferred_qs_try_report() helper. The rescue timer
reuses the same helper so that, under preempt=none, the QS is reported
promptly without depending on the scheduler. The normal deferred-QS report
path cancels the rescue timer, so the timer does not fire once the
quiescent state has already been reported.
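
The interaction between the rescue timer and the normal report path can be
sketched the same way. This is again a user-space model under stated
assumptions: the timer is reduced to an "armed" flag, clearing that flag
stands in for cancelling the hrtimer, and every model_*() name is
hypothetical rather than taken from the patches.

```c
#include <assert.h>
#include <stdbool.h>

/* User-space model only; a real hrtimer is replaced by an "armed" flag. */
struct model_rescue {
	bool timer_armed;
	bool qs_reported;
	int rescue_fires;	/* times the rescue timer callback ran */
	int reports;		/* times a QS was actually reported */
};

/* Shared report helper: reports the QS at most once. */
static void model_try_report(struct model_rescue *m)
{
	if (!m->qs_reported) {
		m->qs_reported = true;
		m->reports++;
	}
}

/* Normal report path; optionally cancels the pending rescue timer. */
static void model_normal_report(struct model_rescue *m, bool cancel)
{
	model_try_report(m);
	if (cancel)
		m->timer_armed = false;	/* stand-in for cancelling the timer */
}

/* Timer expiry: runs the callback only if the timer is still armed. */
static void model_timer_expiry(struct model_rescue *m)
{
	if (!m->timer_armed)
		return;
	m->timer_armed = false;
	m->rescue_fires++;
	model_try_report(m);
}

/* Scenario: arm the timer, report via the normal path, then expire. */
int model_rescue_fires(bool cancel)
{
	struct model_rescue m = { true, false, 0, 0 };

	model_normal_report(&m, cancel);
	model_timer_expiry(&m);
	return m.rescue_fires;
}

/* Scenario: the normal path never runs, so the timer must report. */
bool model_rescue_reports_alone(void)
{
	struct model_rescue m = { true, false, 0, 0 };

	model_timer_expiry(&m);
	return m.qs_reported && m.reports == 1;
}
```

Without the cancel, the model still counts one redundant timer fire after
the QS was already reported. With the cancel, the timer fires only when no
other path reported first.
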
This version is rebased on top of Paul's latest rcu/dev branch. The
rcutorture reader-end deboost test patches that were folded into v3 are now
in rcu/dev and have been dropped here. The git tree below additionally
carries two debug-only commits on top of the series ([TEST COMMIT], not for
merge): a detector that WARNs if defer_qs_pending is stuck at GP cleanup,
and an rcutorture tweak that gives the async deboost mechanisms up to 500us
before warning. Applied alone on unmodified mainline, the detector reliably
fires within 5 minutes under TREE03 rcutorture; with the full fix applied I
could not reproduce the issue.
The git tree with all patches can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux.git (tag: rcu-dqs-stuck-v4-20260625)
Change log:
Changes from v3 to v4:
- Rebased on top of Paul's latest rcu/dev branch.
- Dropped the rcutorture reader-end deboost test patches.
- Reclassified "rcutorture: give async deboost mechanisms up to 500us before
  WARN" as a debug-only [TEST COMMIT] (not for merge).
- "rcu: add per-CPU rescue hrtimer for deferred-QS reporting": cancel the
  rescue hrtimer from rcu_preempt_deferred_qs_irqrestore() via
  hrtimer_try_to_cancel() so it no longer fires after a normal report path
  has already reported the QS -- about a 90% reduction in rescue-timer fires
  under TREE03 with rcutorture.gp_exp=1.
- Patches 1-7 are unchanged from v3.
Changes from v2 to v3:
- Folded in the rcutorture "reader-end deboost testing" patches (three from
  Paul, two from me), previously posted separately as an RFC, so the fix
  and its test coverage can be reviewed together:
  https://lore.kernel.org/all/20260616222622.2981876-1-joelagnelf@xxxxxxxxxx/
- New patch "rcu: add per-CPU rescue hrtimer for deferred-QS reporting" to
  bound the worst-case deferred-QS reporting latency.
- New patch "rcu: clear defer_qs_pending in deferred-QS bail when nesting > 0".
- Reworked "rcu: clear defer_qs_pending in handler for compounded sections":
  the irq-work handler now reports the deferred QS directly via the new
  rcu_preempt_deferred_qs_try_report() helper when it lands in a clean
  context, instead of only nudging the scheduler.
Changes from v1 to v2:
- Dropped RFC tag now that softirq paths have been investigated.
- Added new patch "rcu: set need_resched on softirq deferred-QS arming
  path" to handle the softirq arming case that was deferred in v1.
Link to v3: https://lore.kernel.org/all/20260618185030.376450-1-joelagnelf@xxxxxxxxxx/
Link to v2: https://lore.kernel.org/all/20260526225014.314734-1-joelagnelf@xxxxxxxxxx/
Link to v1: https://lore.kernel.org/all/20260522142342.1536533-1-joelagnelf@xxxxxxxxxx/
Joel Fernandes (8):
  rcu: introduce rcu_defer_qs_clear() helper
  rcu: clear defer_qs_pending when notifying GP changes
  rcu: clear defer_qs_pending in handler for compounded sections
  rcu: drop redundant defer_qs_pending clear in irqrestore handler
  rcu: clear defer_qs_pending at expedited IPI entry
  rcu: set need_resched on softirq deferred-QS arming path
  rcu: clear defer_qs_pending in deferred-QS bail when nesting > 0
  rcu: add per-CPU rescue hrtimer for deferred-QS reporting

 kernel/rcu/tree.c        |   3 +
 kernel/rcu/tree.h        |   6 ++
 kernel/rcu/tree_exp.h    |   6 ++
 kernel/rcu/tree_plugin.h | 149 ++++++++++++++++++++++++++++++++++-----
 4 files changed, 147 insertions(+), 17 deletions(-)
base-commit: 47e26f0fd70890ddd810887a043303a365a8bf03
--
2.34.1