Re: [PATCH rcu/next 3/3] rcu: Call trace_rcu_callback() also for bypass queuing (v2)

From: Joel Fernandes
Date: Sat Sep 17 2022 - 18:42:47 EST


On Sat, Sep 17, 2022 at 6:21 PM Frederic Weisbecker <frederic@xxxxxxxxxx> wrote:
>
> On Sat, Sep 17, 2022 at 05:43:06PM -0400, Joel Fernandes wrote:
> > On Sat, Sep 17, 2022 at 3:58 PM Frederic Weisbecker <frederic@xxxxxxxxxx> wrote:
> > >
> > > On Sat, Sep 17, 2022 at 04:42:00PM +0000, Joel Fernandes (Google) wrote:
> > > > @@ -2809,17 +2825,15 @@ void call_rcu(struct rcu_head *head, rcu_callback_t func)
> > > > }
> > > >
> > > > check_cb_ovld(rdp);
> > > > - if (rcu_nocb_try_bypass(rdp, head, &was_alldone, flags))
> > > > +
> > > > + if (rcu_nocb_try_bypass(rdp, head, &was_alldone, flags)) {
> > > > + __trace_rcu_callback(head, rdp);
> > > > return; // Enqueued onto ->nocb_bypass, so just leave.
> > > > + }
> > >
> > > I think the bypass enqueues should be treated differently. Either with extending
> > > the current trace_rcu_callback/trace_rcu_kvfree_callback (might break tools)
> > >
> > > or
> > > with creating a new trace_rcu_callback_bypass()/trace_rcu_kvfree_callback_bypass().
> > >
> > > Those could later be paired with a trace_rcu_bypass_flush().
> >
> > I am having a hard time seeing why it should be treated differently.
> > We already increment the length of the main segcblist even when
> > bypassing. Why not just call the trace point instead of omitting it?
>
> I'm not suggesting to omit it. I'm suggesting to improve its precision.

That's exactly what I'm doing :-). It is imprecise the way it is, by
calling it where it needs to be (not omitting it), I am making it more
precise.

> > Otherwise it becomes a bit confusing IMO (say someone does not enable
> > your proposed new bypass tracepoint and only enables the existing one,
> > then they would see weird traces where call_rcu is called but their
> > traces are missing trace_rcu_callback).
>
> Well, if they decided to see only half of the information...

It is not that they decide, there are lots of RCU tracepoints and it
is likely common to enable just a few of them.

> > Not to mention - your
> > suggestion will also complicate writing tools that use the existing
> > rcu_callback tracepoint to monitor call_rcu().
>
> If we add another tracepoint, the prototype will be the same as the
> existing one, not many lines to add. If instead we extend the existing
> tracepoint, it's merely just a flag to check or ignore.
>
> OTOH your suggestion doesn't provide any bypass related information.

Bypass related information is not relevant to this patch. I am already
using trace_rcu_callback() in my (yet to be released) rcutop, and I
don't use it for any bypass information.

> > Also if you see the definition of rcu_callback, "Tracepoint for the
> > registration of a single RCU callback function.". That pretty much
> > fits the usage here.
>
> Doesn't tell if it's a bypass or not.

It doesn't tell a lot of things, so what? Saying that it is bypass is
not the point of this patch.

> > As for tracing of the flushing, I don’t care about tracing that at the
> > moment using tracepoints
>
> You will soon enough ;-)

I respect your experience in the matter :-)

> > but I don’t mind if it is added later.
> > Maybe let’s let Paul help resolve our disagreement on this one? :)
>
> FWIW, I would be personally interested in such tracepoints (or the extension

Understood.

> of the existing ones, whichever way you guys prefer), they would be of great help
> for debugging.
>
> Also if rcu_top is ever released, I really hope the kernel will be ready in
> case we want the tool to display bypass related informations.

I feel the main issue you have with my patch is that it does not add
the information you want, however the information you mention is
beyond the scope of the patch and can in future/different patches.
This patch only fixes an *existing* broken tracepoint.

I can certainly add a new bypass-related tracepoint in the future, but
I don't see how that's relevant to my patch.

> Please be careful while designing tracepoints that may be consumed by userspace
> released tools. Such tracepoints eventually turn into ABI and there is no way
> back after that.

Sure thing, that's why I'm fixing the broken tracepoint. Some
registered callbacks can be invisible to the user.

- Joel