Re: [PATCH 1/7] rcu/segcblist: Prevent useless GP start if no CBs to accelerate

From: Paul E. McKenney
Date: Thu Jun 18 2020 - 19:09:38 EST


On Thu, Jun 18, 2020 at 03:11:19PM -0700, Paul E. McKenney wrote:
> On Thu, Jun 18, 2020 at 04:29:49PM -0400, Joel Fernandes (Google) wrote:
>
> First, this looks like a very nice optimization, thank you!
>
> > rcu_segcblist_accelerate() returns true if a GP is to be
> > started/requested and false if not. During tracing, I found that it is
> > asking that GPs be requested
>
> s/GPs/unnecessary GPs/? Plus "." at end of the sentence.
>
> > The exact flow seems to be something like:
> > 1. Callbacks are queued on CPU A - into the NEXT list.
> > 2. softirq runs on CPU A, accelerate all CBs from NEXT->WAIT and request a GP X.
> > 3. GP thread wakes up on another CPU, starts the GP X and requests QS from CPU A.
> > 4. CPU A's softirq runs again presumably because of #3.
>
> Yes, that is one reason RCU softirq might run again.
>
> > 5. CPU A's softirq now does acceleration again, this time no CBs are
> > accelerated since last attempt, but it still requests GP X+1 which
> > could be useless.
>
> I can't think of a case where this request helps. How about: "but
> it still unnecessarily requests GP X+1"?
>
> > The fix is, prevent the useless GP start if we detect no CBs are there
> > to accelerate.
> >
> > With this, we have the following improvement in short runs of
> > rcutorture (5 seconds each):
> > +----+-------------------+-------------------+
> > | # | Number of GPs | Number of Wakeups |
> > +====+=========+=========+=========+=========+
> > | 1 | With | Without | With | Without |
> > +----+---------+---------+---------+---------+
> > | 2 | 75 | 89 | 113 | 119 |
> > +----+---------+---------+---------+---------+
> > | 3 | 62 | 91 | 105 | 123 |
> > +----+---------+---------+---------+---------+
> > | 4 | 60 | 79 | 98 | 110 |
> > +----+---------+---------+---------+---------+
> > | 5 | 63 | 79 | 99 | 112 |
> > +----+---------+---------+---------+---------+
> > | 6 | 57 | 89 | 96 | 123 |
> > +----+---------+---------+---------+---------+
> > | 7 | 64 | 85 | 97 | 118 |
> > +----+---------+---------+---------+---------+
> > | 8 | 58 | 83 | 98 | 113 |
> > +----+---------+---------+---------+---------+
> > | 9 | 57 | 77 | 89 | 104 |
> > +----+---------+---------+---------+---------+
> > | 10 | 66 | 82 | 98 | 119 |
> > +----+---------+---------+---------+---------+
> > | 11 | 52 | 82 | 83 | 117 |
> > +----+---------+---------+---------+---------+
>
> So the reductions in wakeups ranges from 5% to 40%, with almost a 20%
> overall reduction in wakeups across all the runs. That should be of
> some use to someone. ;-)
>
> I do run rcutorture quite a bit, but is there a more real-world
> benchmark that could be tried?
>
> > Cc: urezki@xxxxxxxxx
> > Signed-off-by: Joel Fernandes (Google) <joel@xxxxxxxxxxxxxxxxx>

As discussed over IRC, I updated the patch as shown below. Does that
work for you?

Thanx, Paul

------------------------------------------------------------------------

commit ec037e1f438074eb16fd68a63d699fc419c9ba0c
Author: Joel Fernandes (Google) <joel@xxxxxxxxxxxxxxxxx>
Date: Thu Jun 18 16:29:49 2020 -0400

rcu/segcblist: Prevent useless GP start if no CBs to accelerate

The rcu_segcblist_accelerate() function returns true iff it is necessary
to request another grace period. A tracing session showed that this
function unnecessarily requests grace periods.

For exmaple, consider the following sequence of events:
1. Callbacks are queued only on the NEXT segment of CPU A's callback list.
2. CPU A runs RCU_SOFTIRQ, accelerating these callbacks from NEXT to WAIT.
3. Thus rcu_segcblist_accelerate() returns true, requesting grace period N.
4. RCU's grace-period kthread wakes up on CPU B and starts grace period N.
4. CPU A notices the new grace period and invokes RCU_SOFTIRQ.
5. CPU A's RCU_SOFTIRQ again invokes rcu_segcblist_accelerate(), but
there are no new callbacks. However, rcu_segcblist_accelerate()
nevertheless (uselessly) requests a new grace period N+1.

This extra grace period results in additional lock contention and also
additional wakeups, all for no good reason.

This commit therefore adds a check to rcu_segcblist_accelerate() that
prevents the return of true when there are no new callbacks.

This change reduces the number of grace periods (GPs) and wakeups in each
of eleven five-second rcutorture runs as follows:

+----+-------------------+-------------------+
| # | Number of GPs | Number of Wakeups |
+====+=========+=========+=========+=========+
| 1 | With | Without | With | Without |
+----+---------+---------+---------+---------+
| 2 | 75 | 89 | 113 | 119 |
+----+---------+---------+---------+---------+
| 3 | 62 | 91 | 105 | 123 |
+----+---------+---------+---------+---------+
| 4 | 60 | 79 | 98 | 110 |
+----+---------+---------+---------+---------+
| 5 | 63 | 79 | 99 | 112 |
+----+---------+---------+---------+---------+
| 6 | 57 | 89 | 96 | 123 |
+----+---------+---------+---------+---------+
| 7 | 64 | 85 | 97 | 118 |
+----+---------+---------+---------+---------+
| 8 | 58 | 83 | 98 | 113 |
+----+---------+---------+---------+---------+
| 9 | 57 | 77 | 89 | 104 |
+----+---------+---------+---------+---------+
| 10 | 66 | 82 | 98 | 119 |
+----+---------+---------+---------+---------+
| 11 | 52 | 82 | 83 | 117 |
+----+---------+---------+---------+---------+

The reduction in the number of wakeups ranges from 5% to 40%.

Cc: urezki@xxxxxxxxx
[ paulmck: Rework commit log and comment. ]
Signed-off-by: Joel Fernandes (Google) <joel@xxxxxxxxxxxxxxxxx>
Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxx>

diff --git a/kernel/rcu/rcu_segcblist.c b/kernel/rcu/rcu_segcblist.c
index 9a0f661..2d2a6b6b9 100644
--- a/kernel/rcu/rcu_segcblist.c
+++ b/kernel/rcu/rcu_segcblist.c
@@ -475,8 +475,16 @@ bool rcu_segcblist_accelerate(struct rcu_segcblist *rsclp, unsigned long seq)
* Also advance to the oldest segment of callbacks whose
* ->gp_seq[] completion is at or after that passed in via "seq",
* skipping any empty segments.
+ *
+ * Note that segment "i" (and any lower-numbered segments
+ * containing older callbacks) will be unaffected, and their
+ * grace-period numbers remain unchanged. For example, if i ==
+ * WAIT_TAIL, then neither WAIT_TAIL nor DONE_TAIL will be touched.
+ * Instead, the CBs in NEXT_TAIL will be merged with those in
+ * NEXT_READY_TAIL and the grace-period number of NEXT_READY_TAIL
+ * would be updated. NEXT_TAIL would then be empty.
*/
- if (++i >= RCU_NEXT_TAIL)
+ if (rcu_segcblist_restempty(rsclp, i) || ++i >= RCU_NEXT_TAIL)
return false;

/*