[PATCH tip/core/rcu 0/21] Contention reduction for v4.18

From: Paul E. McKenney
Date: Sun Apr 22 2018 - 23:02:04 EST


Hello!

This series reduces lock contention on the root rcu_node structure,
and is also the first precursor to TBD changes to consolidate the three
RCU flavors (RCU-bh, RCU-preempt, and RCU-sched) into one.

1. Improve non-root rcu_cbs_completed() accuracy, thus reducing the
need to acquire the root rcu_node structure's ->lock. This also
eliminates the need to reassign callbacks to an earlier grace
period, which enables introduction of funnel locking in a later
commit, which further reduces contention.

2. Make rcu_start_future_gp()'s grace-period check more precise,
eliminating one need for forward-progress failsafe checks
that acquire the root rcu_node structure's ->lock.

3. Create (and make use of) accessors for the ->need_future_gp[]
array to enable easy changes in size.

4. Make rcu_gp_kthread() check for early-boot activity, which was
another situation needing failsafe checks.

5. Make rcu_gp_cleanup() more accurately predict need for new GP.
This eliminates the need for both failsafe checks and extra
grace-period kthread wakeups.

6. Avoid losing ->need_future_gp[] values due to GP start/end races
by expanding this array from two elements to four.

7. Make rcu_future_needs_gp() check all ->need_future_gp[] elements,
again to eliminate a need for both failsafe checks and extra
grace-period kthread wakeups.

8. Convert ->need_future_gp[] array to boolean, given that there
is no longer a need to count the number of requests for a
future grace period.

9. Make rcu_migrate_callbacks() wake GP kthread when needed, which
again eliminates a need for failsafe checks.

10. Avoid __call_rcu_core() root rcu_node ->lock acquisition, which
was one of the failsafe checks that many of the above patches
were making safe to remove.

11. Switch __rcu_process_callbacks() to rcu_accelerate_cbs(), which
was one of the failsafe checks that many of the above patches
were making safe to remove. (Yes, this one also acquired the
root rcu_node structure's ->lock, and was in fact the lock
acquisition that was showing up in Nick Piggin's traces.)

12. Put ->completed into an unsigned long instead of an int. (The
"int" was harmless because only the low-order bits were used,
but it was still an accident waiting to happen.)

13. Clear requests other than RCU_GP_FLAG_INIT at grace-period end.
This prevents premature quiescent-state forcing that might
otherwise occur due to requests posted when the grace period
was already almost done.

14. Inline rcu_start_gp_advanced() into rcu_start_future_gp().
This brings RCU down to only one function to start a grace
period, in happy contrast to the need to choose correctly
between three of them before this patch series.

15. Make rcu_start_future_gp() caller select grace period to avoid
duplicate grace-period selection. (We are going to like this
grace period so much that we selected it twice!)

16. Add funnel locking to rcu_start_this_gp(), the point being to
reduce lock contention, especially on large systems.

17. Make rcu_start_this_gp() check for out-of-range requests.
If this check triggers, that indicates a bug in a caller of
rcu_start_this_gp() or that the ->need_future_gp[] array needs
to be even bigger, most likely the former. More importantly, it
avoids one possible cause of otherwise silent grace-period hangs.

18. The rcu_gp_cleanup() function does not need cpu_needs_another_gp()
because funnel locking summarizes the need for future
grace periods in the root rcu_node structure, whose ->lock
rcu_gp_cleanup() already holds for other reasons.

19. Simplify and inline cpu_needs_another_gp(), which used to be a key
part of the no-longer-required forward-progress failsafe checks.

20. Drop early GP request check from rcu_gp_kthread(). Yes, it
was added above in order to avoid grace-period hangs, but at this
point in the series it is no longer needed. All in the name of
bisectability.

21. Update list of rcu_future_grace_period() trace events to reflect
strings added above.
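
For readers unfamiliar with funnel locking (patch 16), the general idea
can be sketched in user space roughly as follows. This is an
illustration only, not the code in this series: struct node,
need_slot(), gp_before(), request_gp(), and the GP_REQ_* values are
hypothetical names, pthread mutexes stand in for the rcu_node ->lock,
and the four-slot boolean array is assumed to resemble the
->need_future_gp[] array after patches 6 and 8.

```c
#include <assert.h>
#include <limits.h>
#include <pthread.h>
#include <stdbool.h>

#define NEED_SLOTS 4	/* power of two, so low-order GP bits index it */

enum gp_req { GP_REQ_DONE, GP_REQ_WAKE, GP_REQ_RANGE };

struct node {
	pthread_mutex_t lock;
	struct node *parent;			/* NULL at the root */
	bool need_future_gp[NEED_SLOTS];	/* one flag per nearby GP */
	unsigned long completed;		/* last completed GP number */
};

/* Accessor hiding the array size, in the spirit of patch 3. */
static bool *need_slot(struct node *np, unsigned long gp)
{
	return &np->need_future_gp[gp & (NEED_SLOTS - 1)];
}

/* Wrap-safe "a comes before b" for free-running counters. */
static bool gp_before(unsigned long a, unsigned long b)
{
	return ULONG_MAX / 2 < a - b;
}

/*
 * Record a request for grace period gp, walking from leaf toward the
 * root and taking one lock at a time.  The walk stops at the first
 * node where the request is already recorded, so most callers never
 * touch the root's lock.  Returns GP_REQ_WAKE only to the caller that
 * first records the request at the root.
 */
enum gp_req request_gp(struct node *leaf, unsigned long gp)
{
	struct node *np;

	assert(leaf);
	for (np = leaf; np; np = np->parent) {
		enum gp_req ret = GP_REQ_WAKE;
		bool stop = true;

		pthread_mutex_lock(&np->lock);
		if (!gp_before(np->completed, gp)) {
			ret = GP_REQ_DONE;	/* gp has already ended */
		} else if (gp - np->completed >= NEED_SLOTS) {
			ret = GP_REQ_RANGE;	/* would alias another slot */
		} else if (*need_slot(np, gp)) {
			ret = GP_REQ_DONE;	/* someone got here first */
		} else {
			*need_slot(np, gp) = true;
			stop = !np->parent;	/* keep going unless at root */
		}
		pthread_mutex_unlock(&np->lock);
		if (stop)
			return ret;
	}
	return GP_REQ_DONE;	/* not reached for a non-NULL leaf */
}
```

With a root and two leaves, the first request_gp() for a given grace
period returns GP_REQ_WAKE, and a repeat request from the same leaf
stops at that leaf without acquiring the root's lock. A request too
far in the future returns GP_REQ_RANGE instead of silently aliasing
another slot, loosely analogous to the check described in patch 17.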

Thanx, Paul

------------------------------------------------------------------------

include/trace/events/rcu.h | 13 -
kernel/rcu/rcu_segcblist.c | 18 -
kernel/rcu/rcu_segcblist.h | 2
kernel/rcu/tree.c | 406 ++++++++++++++++-----------------------------
kernel/rcu/tree.h | 24 ++
kernel/rcu/tree_plugin.h | 28 ---
6 files changed, 182 insertions(+), 309 deletions(-)