[PATCH v1 00/11] RCU: Enable callbacks to benefit from expedited grace periods

From: Puranjay Mohan

Date: Wed Jun 24 2026 - 09:25:51 EST


This series lets call_rcu() callbacks be reclaimed as soon as either a
normal or an expedited grace period that covers them has elapsed, rather
than always waiting for a normal grace period.

Motivation
==========
Today there is an asymmetry: synchronize_rcu_expedited() callers get fast
reclaim, but call_rcu() callers never benefit from those same expedited
grace periods, even though an expedited GP proves exactly the same thing
as a normal one -- all pre-existing readers are done. When expedited GPs
are running on the system (driven by other subsystems), call_rcu()
callbacks that could already be freed instead sit in RCU_WAIT_TAIL until
the next normal GP. This series treats a grace period as a grace period
regardless of how it was driven, so memory is reclaimed sooner.

Design
======
Each callback segment now records both the normal and the expedited
grace-period sequence numbers in a struct rcu_gp_seq, and
rcu_segcblist_advance() releases a segment as soon as
poll_state_synchronize_rcu_full() reports that either grace period has
completed. Three notification paths are taught about expedited
completion so that the advance actually happens: the NOCB rcuog
kthreads, the rcu_pending() tick gate, and rcu_core().
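
For readers who want the "either flavor releases the segment" rule in
concrete form, here is a minimal user-space sketch. It is a toy model,
not the code in this series: every toy_* name is hypothetical, the
counters advance by one per completed grace period, and a grace period
already in flight at snapshot time is deliberately not modeled.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rcu_gp_seq: one cookie per GP flavor. */
struct toy_gp_seq {
	unsigned long norm;	/* normal GP count that must be reached */
	unsigned long exp;	/* expedited GP count that must be reached */
};

/* Toy global state: number of completed GPs of each flavor. */
struct toy_rcu_state {
	unsigned long norm_done;
	unsigned long exp_done;
};

/* Wrap-safe "a >= b" for free-running unsigned counters. */
static bool toy_cmp_ge(unsigned long a, unsigned long b)
{
	return a - b <= ULONG_MAX / 2;
}

/* A callback queued now needs one more GP of either flavor (toy rule). */
static struct toy_gp_seq toy_snap(const struct toy_rcu_state *rs)
{
	struct toy_gp_seq s = { .norm = rs->norm_done + 1,
				.exp = rs->exp_done + 1 };
	return s;
}

/* True once either flavor has reached its cookie. */
static bool toy_poll_full(const struct toy_rcu_state *rs,
			  const struct toy_gp_seq *s)
{
	return toy_cmp_ge(rs->norm_done, s->norm) ||
	       toy_cmp_ge(rs->exp_done, s->exp);
}

#define TOY_NSEGS 3

/* Waiting segments, oldest first, each tagged with its cookie pair. */
struct toy_cblist {
	struct toy_gp_seq seq[TOY_NSEGS];
	size_t nwaiting;	/* segments still waiting for a GP */
	size_t ndone;		/* segments released so far */
};

/* Tag a new segment with the current snapshot; false if the list is full. */
static bool toy_enqueue(struct toy_cblist *l, const struct toy_rcu_state *rs)
{
	assert(l != NULL && rs != NULL);
	if (l->nwaiting == TOY_NSEGS)
		return false;
	l->seq[l->nwaiting++] = toy_snap(rs);
	return true;
}

/* Release leading segments whose cookie is satisfied; return how many. */
static size_t toy_advance(struct toy_cblist *l, const struct toy_rcu_state *rs)
{
	size_t n = 0, i;

	assert(l != NULL && rs != NULL);
	while (n < l->nwaiting && toy_poll_full(rs, &l->seq[n]))
		n++;
	for (i = n; i < l->nwaiting; i++)
		l->seq[i - n] = l->seq[i];
	l->nwaiting -= n;
	l->ndone += n;
	return n;
}
```

In this model a segment queued while both counters are zero stays put
until either norm_done or exp_done is incremented, at which point
toy_advance() releases it. Tracking only norm, as before this series,
would leave it waiting across any number of expedited grace periods.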

Changelog:
RFC: https://lore.kernel.org/all/20260417231203.785172-1-puranjay@xxxxxxxxxx/
Changes in v1:
- New prep patch 1 renames struct rcu_gp_oldstate to struct rcu_gp_seq
  and its fields rgos_norm/rgos_exp to norm/exp tree-wide (Frederic).
- The rcu_segcblist segment field stays named gp_seq; only its type
  changes (Frederic).
- Patch 8 (NOCB wake) is reworked. The RFC woke the wrong waitqueue
  (rdp_gp->nocb_gp_wq via wake_nocb_gp() rather than the leaf
  rnp->nocb_gp_wq[] that an rcuog kthread waiting for a GP sleeps on),
  and the wait condition only checked the normal ->gp_seq. The rcuog
  grace-period wait now tracks a struct rcu_gp_seq and is released via
  poll_state_synchronize_rcu_full(); rcu_exp_wait_wake() wakes the leaf
  node through the new rcu_nocb_exp_cleanup() (Frederic).
- rcu_pending() uses a new memory-ordering-free
  poll_state_synchronize_rcu_full_unordered() to avoid memory barriers
  on every tick, leaving the ordering duty to rcu_core() (Frederic).

Still open: Frederic asked whether the first smp_mb() in
poll_state_synchronize_rcu_full() is needed on the callback-advance path
(patch 6). That path still uses the fully ordered helper; only
rcu_pending() was switched to the unordered variant. Happy to revisit.
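
To make the ordered/unordered split concrete, the following user-space
sketch uses C11 atomics. It is only an analogy under stated assumptions:
the toy_* names are hypothetical, the seq_cst fences merely stand in for
the kernel's smp_mb(), and the fence placement is not claimed to match
poll_state_synchronize_rcu_full().

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the normal and expedited GP counters. */
static atomic_ulong toy_norm_seq;
static atomic_ulong toy_exp_seq;

/* Cookie pair a waiter wants to see reached (toy struct rcu_gp_seq). */
struct toy_cookie {
	unsigned long norm;
	unsigned long exp;
};

/* Relaxed, wrap-safe check that *sp has reached 'want'. */
static bool toy_seq_done(atomic_ulong *sp, unsigned long want)
{
	unsigned long cur = atomic_load_explicit(sp, memory_order_relaxed);

	return cur - want <= ULONG_MAX / 2;
}

/* Cheap hint: relaxed loads only, no ordering promised to the caller. */
static bool toy_poll_unordered(const struct toy_cookie *c)
{
	assert(c != NULL);
	return toy_seq_done(&toy_norm_seq, c->norm) ||
	       toy_seq_done(&toy_exp_seq, c->exp);
}

/* Ordered check: the same test bracketed by full fences. */
static bool toy_poll_ordered(const struct toy_cookie *c)
{
	bool done;

	atomic_thread_fence(memory_order_seq_cst);
	done = toy_poll_unordered(c);
	atomic_thread_fence(memory_order_seq_cst);
	return done;
}

/* Tick-style gate: fence-free hint first, ordered re-check only on a hit. */
static bool toy_tick_then_core(const struct toy_cookie *c)
{
	if (!toy_poll_unordered(c))
		return false;	/* common case: no fences paid */
	return toy_poll_ordered(c);
}
```

The idea mirrored here is that the frequent caller pays for no fences
when nothing has completed, and the ordered check runs only once the
hint says there is work to do.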

Puranjay Mohan (11):
  rcu: Rename struct rcu_gp_oldstate to rcu_gp_seq
  rcu/segcblist: Add SRCU and Tasks RCU wrapper functions
  rcu/segcblist: Factor out rcu_segcblist_advance_compact() helper
  rcu/segcblist: Track segment grace periods with struct rcu_gp_seq
  rcu: Add RCU_GET_STATE_NOT_TRACKED for subsystems without expedited
    GPs
  rcu: Enable RCU callbacks to benefit from expedited grace periods
  rcu: Update comments for gp_seq and expedited GP tracking
  rcu: Wake NOCB rcuog kthreads on expedited grace period completion
  rcu: Detect expedited grace period completion in rcu_pending()
  rcu: Advance callbacks for expedited GP completion in rcu_core()
  rcuscale: Add concurrent expedited GP threads for callback scaling
    tests

include/linux/rcu_segcblist.h | 16 ++--
include/linux/rcupdate.h | 13 ++-
include/linux/rcupdate_wait.h | 2 +-
include/linux/rcutiny.h | 36 ++++-----
include/linux/rcutree.h | 29 +++----
include/trace/events/rcu.h | 5 +-
kernel/rcu/rcu.h | 13 ++-
kernel/rcu/rcu_segcblist.c | 139 ++++++++++++++++++++++----------
kernel/rcu/rcu_segcblist.h | 8 +-
kernel/rcu/rcuscale.c | 84 ++++++++++++++++++-
kernel/rcu/rcutorture.c | 30 +++----
kernel/rcu/srcutree.c | 14 ++--
kernel/rcu/tasks.h | 8 +-
kernel/rcu/tiny.c | 4 +-
kernel/rcu/tree.c | 147 ++++++++++++++++++++++------------
kernel/rcu/tree.h | 3 +-
kernel/rcu/tree_exp.h | 20 ++---
kernel/rcu/tree_nocb.h | 131 ++++++++++++++++++++++++------
mm/slab_common.c | 6 +-
19 files changed, 496 insertions(+), 212 deletions(-)


base-commit: 709d17a22bfac78765f6cbaec42e15bcd4aa4f08
--
2.53.0-Meta