[PATCH v2 tip/core/rcu 0/40] SRCU callback parallelization for 4.12

From: Paul E. McKenney
Date: Mon Apr 17 2017 - 19:45:09 EST


This v2 series moves SRCU from its traditional single per-srcu_struct
callback queue to per-srcu_struct/per-CPU callback queues. This involves
abstracting functionality from Tree RCU, which results in a large
conflict footprint, which in turn results in some otherwise unrelated
patches coming along for the ride.
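
For readers unfamiliar with the queueing change, the move from one
callback queue per srcu_struct to one queue per CPU can be sketched in
userspace C as follows. This is only an illustration, not code from the
series: the names srcu_sketch, cb_queue, NR_CPUS_SKETCH, and the
sketch_* helpers are hypothetical, a plain array stands in for per-CPU
data, and both locking and the wait for a grace period before
invocation are omitted.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS_SKETCH 4	/* Hypothetical CPU count for this sketch. */

struct cb {
	struct cb *next;
	void (*func)(struct cb *cb);
};

/* One queue per CPU: list head plus a pointer to the last ->next field. */
struct cb_queue {
	struct cb *head;
	struct cb **tail;
	long len;
};

/* Stand-in for an srcu_struct; the array models per-CPU storage. */
struct srcu_sketch {
	struct cb_queue q[NR_CPUS_SKETCH];
};

static int sketch_invoked;	/* Number of callbacks invoked so far. */

/* Trivial callback that only counts its own invocations. */
void sketch_count_cb(struct cb *cb)
{
	(void)cb;
	sketch_invoked++;
}

void srcu_sketch_init(struct srcu_sketch *sp)
{
	for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++) {
		sp->q[cpu].head = NULL;
		sp->q[cpu].tail = &sp->q[cpu].head;
		sp->q[cpu].len = 0;
	}
}

/* O(1) enqueue onto the given CPU's queue; other CPUs are untouched. */
void srcu_sketch_enqueue(struct srcu_sketch *sp, int cpu, struct cb *cb,
			 void (*func)(struct cb *cb))
{
	struct cb_queue *q = &sp->q[cpu];

	cb->next = NULL;
	cb->func = func;
	*q->tail = cb;
	q->tail = &cb->next;
	q->len++;
}

/* Detach one CPU's queue, invoke its callbacks, and return the count. */
long srcu_sketch_invoke(struct srcu_sketch *sp, int cpu)
{
	struct cb_queue *q = &sp->q[cpu];
	struct cb *list = q->head;
	long n = 0;

	q->head = NULL;
	q->tail = &q->head;
	q->len = 0;
	while (list) {
		struct cb *next = list->next;	/* func may free list. */

		list->func(list);
		list = next;
		n++;
	}
	return n;
}
```

Because each CPU enqueues onto and drains only its own queue in this
sketch, enqueues on different CPUs never touch the same list, which is
the property that makes parallel callback handling possible.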

1. Maintain special bits at bottom of ->dynticks counter.
This is for some upcoming MM work. My intent was to hold
it until that work was ready, but merge conflicts dictated
otherwise. If the MM work does not appear soonish, I will
manually revert this patch.

2. Make arch select smp_mb__after_unlock_lock() strength, which
gets rid of an arch-specific #ifdef.

3. Consolidate SRCU batch checking into rcu_all_batches_empty().

4. Check for tardy grace-period activity in cleanup_srcu_struct().

5-7. Move the semicolon inside RCU_TRACE() invocations in various
parts of RCU.

8. Pull rcu_sched_qs_mask into rcu_dynticks structure in order to
eliminate an isolated per-CPU variable.

9. Pull rcu_qs_ctr into rcu_dynticks structure.

10. Eliminate flavor scan in rcu_momentary_dyntick_idle() to
reduce semi-common-case context-switch overhead.

11. Place guard on rcu_all_qs() and rcu_note_context_switch()
actions to reduce common-case scheduler-fastpath overhead.

12. Default RCU_FANOUT_LEAF to 16 unless explicitly changed.

13. Abstract multi-tail callback list handling for SRCU.

14. Allow SRCU to access rcu_scheduler_active.

15. Allow early boot use of synchronize_srcu(), though not yet
mid-boot use.

16. Add single-element dequeue functions to rcu_segcblist for
debug use.

17. Move rcu_seq_start() and friends to rcu.h for SRCU's benefit.

18. Expedited wakeups need to be fully ordered.

19. Fix warning in rcu_seq_end().

20. Push srcu_advance_batches() fastpath into common case as a
step towards callback parallelization.

21. Move to state-based grace-period sequencing, also as a step
towards callback parallelization.

22. Add grace-period sequence numbers to SRCU.

23. Use rcu_segcblist to track SRCU callbacks.

24. Move combining-tree definitions for SRCU's benefit.

25. Move rcu_init_levelspread() to rcu_tree_node.h for SRCU's benefit.

26. Remove redundant levelcnt[] array from rcu_init_one().

27. Move rcu_node traversal macros to rcu.h for SRCU's benefit.

28. Make num_rcu_lvl[] array be external for SRCU's benefit.

29. Fix bogus try_check_zero() comment.

30. Improve rcu_seq grace-period-counter abstraction for SRCU's
benefit.

31. Allow a second bit in rcu_seq for SRCU state.

32. Merge ->srcu_state into ->srcu_gp_seq to allow atomic updates.

33. Provide crude control of expedited SRCU grace periods.

34. Use static initialization for "srcu" in mm/mmu_notifier.c.

35. Create a tiny SRCU for bloatwatch/tinification.

36. Print Tiny SRCU reader statistics in rcutorture.

37. Introduce CLASSIC_SRCU Kconfig option for those who do not
wish to help debug Tree SRCU.

38. Parallelize SRCU callback handling.

39. Make non-preemptive schedule be Tasks RCU quiescent state.
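
Several of the patches above (21, 22, 31, and 32) keep grace-period
state in the low-order bits of a sequence counter so that counter and
state can be updated together in a single store. The following
userspace sketch shows that general technique. It is not taken from
the patches: the seq_* names, the two-bit shift, and the exact
arithmetic are assumptions made for illustration.

```c
#include <assert.h>
#include <limits.h>

/* Assumed layout: two low-order state bits, counter in the rest. */
#define SEQ_CTR_SHIFT	2
#define SEQ_STATE_MASK	((1UL << SEQ_CTR_SHIFT) - 1)

/* Number of grace periods completed so far. */
static inline unsigned long seq_ctr(unsigned long s)
{
	return s >> SEQ_CTR_SHIFT;
}

/* Zero when idle, nonzero while a grace period is in progress. */
static inline unsigned long seq_state(unsigned long s)
{
	return s & SEQ_STATE_MASK;
}

/* Start a grace period: the state moves from 0 to 1. */
static inline void seq_start(unsigned long *sp)
{
	assert(seq_state(*sp) == 0);
	*sp += 1;
}

/* Move to another in-progress state without changing the counter. */
static inline void seq_set_state(unsigned long *sp, unsigned long newstate)
{
	assert(newstate <= SEQ_STATE_MASK);
	*sp = (*sp & ~SEQ_STATE_MASK) + newstate;
}

/* End a grace period: clear the state and bump the counter in one store. */
static inline void seq_end(unsigned long *sp)
{
	assert(seq_state(*sp) != 0);
	*sp = (*sp | SEQ_STATE_MASK) + 1;
}

/*
 * Return the value the sequence must reach before a full grace period
 * has elapsed from now.  A grace period already in progress does not
 * count, because it might have started before the caller's update.
 */
static inline unsigned long seq_snap(const unsigned long *sp)
{
	return (*sp + 2 * SEQ_STATE_MASK + 1) & ~SEQ_STATE_MASK;
}

/* Wrap-tolerant check for whether the snapshot has been reached. */
static inline int seq_done(const unsigned long *sp, unsigned long snap)
{
	return ULONG_MAX / 2 >= *sp - snap;
}
```

With this layout, a snapshot taken while idle is satisfied by the next
complete grace period, while a snapshot taken mid-grace-period requires
the current grace period to end and one more to run to completion.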

Updates since v1:

o Incorporated feedback from Peter Zijlstra.

o Dropped v1 patches 8-10 ("Make various parts of RCU do deferred
NOCB wakeups in order to prevent callback blockages, and thus
hangs"). These patches turned out to be papering over a no-CBs
CPU design flaw. There will be patches in v4.13 to fix the design
flaw directly.

o Added v2 patch #34 ("Use static initialization for "srcu" in
mm/mmu_notifier.c"), moving it from its v1 location in the
fixes series.

o Added v2 patch #39 ("Make non-preemptive schedule be Tasks RCU
quiescent state") for the benefit of upcoming ftrace work at
Steve Rostedt's request.

Thanx, Paul


/kernel/rcu/rcu_segcblist.h                                     |  670 -----
b/Documentation/RCU/Design/Data-Structures/Data-Structures.html |   36
b/arch/Kconfig                                                  |    3
b/arch/powerpc/Kconfig                                          |    1
b/include/linux/rcu_node_tree.h                                 |  105
b/include/linux/rcu_segcblist.h                                 |  720 +++++
b/include/linux/rcupdate.h                                      |   17
b/include/linux/rcutiny.h                                       |   24
b/include/linux/rcutree.h                                       |    5
b/include/linux/srcu.h                                          |  112
b/include/linux/srcuclassic.h                                   |  101
b/include/linux/srcutiny.h                                      |   81
b/include/linux/srcutree.h                                      |  171 +
b/init/Kconfig                                                  |   33
b/kernel/rcu/Makefile                                           |    6
b/kernel/rcu/rcu.h                                              |  165 +
b/kernel/rcu/rcu_segcblist.h                                    |  670 +++++
b/kernel/rcu/rcutorture.c                                       |   39
b/kernel/rcu/srcu.c                                             |  846 +++---
b/kernel/rcu/srcutiny.c                                         |  215 +
b/kernel/rcu/srcutree.c                                         | 1252 ++++++++--
b/kernel/rcu/tiny.c                                             |   20
b/kernel/rcu/tiny_plugin.h                                      |   13
b/kernel/rcu/tree.c                                             |  657 ++---
b/kernel/rcu/tree.h                                             |  174 -
b/kernel/rcu/tree_exp.h                                         |   25
b/kernel/rcu/tree_plugin.h                                      |   62
b/kernel/rcu/tree_trace.c                                       |   26
b/kernel/rcu/update.c                                           |   53
b/kernel/sched/core.c                                           |    2
b/mm/mmu_notifier.c                                             |   14
31 files changed, 4287 insertions(+), 2031 deletions(-)