Re: [PATCH v2 1/8] rcu: Introduce call_rcu_lazy() API implementation

From: Paul E. McKenney
Date: Tue Jul 12 2022 - 18:41:58 EST


On Tue, Jul 12, 2022 at 05:10:41PM -0400, Joel Fernandes wrote:
>
>
> On 7/12/2022 5:04 PM, Paul E. McKenney wrote:
> > On Tue, Jul 12, 2022 at 04:53:48PM -0400, Joel Fernandes wrote:
> >>
> >> On 7/10/2022 12:03 PM, Paul E. McKenney wrote:
> >> [..]
> >>>>>> + // Note that if the bypass list has lazy CBs, and the main list is
> >>>>>> + // empty, and rhp happens to be non-lazy, then we end up flushing all
> >>>>>> + // the lazy CBs to the main list as well. That's the right thing to do,
> >>>>>> + // since we are kick-starting RCU GP processing anyway for the non-lazy
> >>>>>> + // one, we can just reuse that GP for the already queued-up lazy ones.
> >>>>>> + if ((rdp->nocb_nobypass_count < nocb_nobypass_lim_per_jiffy && !lazy) ||
> >>>>>> + (lazy && n_lazy_cbs >= qhimark)) {
> >>>>>> rcu_nocb_lock(rdp);
> >>>>>> *was_alldone = !rcu_segcblist_pend_cbs(&rdp->cblist);
> >>>>>> if (*was_alldone)
> >>>>>> trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
> >>>>>> - TPS("FirstQ"));
> >>>>>> - WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, j));
> >>>>>> + lazy ? TPS("FirstLazyQ") : TPS("FirstQ"));
> >>>>>> + WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, j, false));
> >>>>>
> >>>>> The "false" here instead of "lazy" is because the caller is to do the
> >>>>> enqueuing, correct?
> >>>>
> >>>> There is no difference between using false or lazy here, because the bypass
> >>>> flush is not also enqueuing the lazy callback, right?
> >>>>
> >>>> We can also pass lazy instead of false if that's less confusing.
> >>>>
> >>>> Or maybe I missed the issue you're raising?
> >>>
> >>> I am mostly checking up on your intended meaning of "lazy" in various
> >>> contexts. It could mean only that the caller requested laziness, or in
> >>> some cases it could mean that the callback actually will be lazy.
> >>>
> >>> I can rationalize the "false" above as a "don't care" in this case
> >>> because (as you say) there is not callback. In which case this code
> >>> is OK as is, as long as the header comment for rcu_nocb_flush_bypass()
> >>> clearly states that this parameter has meaning only when there really
> >>> is a callback being queued.
> >>
> >> I decided to change this and the below to "lazy" variable instead of
> >> true/false, as the code is cleaner and less confusing IMO. It makes
> >> sense to me and in my testing it works fine. Hope that's Ok with you.
> >
> > Sounds plausible.
> >
> >> About changing the lazy length count to a flag, one drawback of doing
> >> that is, say if there are some non-lazy CBs in the bypass list, then the
> >> lazy shrinker will end up reporting an inaccurate count. Also
> >> considering that it might be harder to add the count back later say if
> >> we need it for tracing, I would say lets leave it as is. I will keep the
> >> counter for v3 and we can discuss. Does that sound good to you?
> >
> > You lost me on this one. If there are any non-lazy callbacks, the whole
> > bypass list is already being treated as non-lazy, right? If so, then
> > the lazy shrinker should report the full count if all callbacks are lazy,
> > and should report none otherwise. Or am I missing something here?
> >
>
> That's one way to interpret it, another way is say there were a 1000
> lazy CBs, and now 1 non-lazy came in. The shrinker could report the lazy
> count as 0 per your interpretation. Yes its true they will get flushed
> out in the next jiffie, but for that time instant, the number of lazy
> CBs in the list is not zero! :) Yeah OK its a weak argument, still an
> argument! ;-)
>
> In any case, we saw the need for the length of the segcb lists to figure
> out things via tracing, so I suspect we may need this in the future as
> well, its a small cost so I would rather keep it if that's Ok with you! :)

OK, being needed for tracing/diagnostics is a somewhat less weak argument...

Let's see what v3 looks like.

Thanx, Paul