Re: [PATCH RFC v5 5/6] tracepoint: Make rcuidle tracepoint callers use SRCU

From: Joel Fernandes
Date: Tue May 01 2018 - 11:16:21 EST


On Tue, May 1, 2018 at 7:34 AM Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx> wrote:

> On Tue, May 01, 2018 at 10:24:01AM -0400, Steven Rostedt wrote:
> > On Mon, 30 Apr 2018 18:42:03 -0700
> > Joel Fernandes <joelaf@xxxxxxxxxx> wrote:
> >
> > > In recent tests with IRQ on/off tracepoints, a large performance
> > > overhead (~10%) is noticed when running hackbench. This is root-caused
> > > to calls to rcu_irq_enter_irqson and rcu_irq_exit_irqson from the
> > > tracepoint code. Following a long discussion on the list [1] about
> > > this, we concluded that SRCU is a better alternative for use during
> > > RCU idle. Although it does involve extra barriers, it's lighter than
> > > the sched-rcu version, which has to do additional RCU calls to notify
> > > RCU idle about entry into RCU sections.
> > >
> > > In this patch, we change the underlying implementation of the
> > > trace_*_rcuidle API to use SRCU. This has been shown to improve
> > > performance a lot for the high-frequency irq enable/disable tracepoints.

> [ . . . ]

> > > --- a/kernel/tracepoint.c
> > > +++ b/kernel/tracepoint.c
> > > @@ -31,6 +31,9 @@
> > > extern struct tracepoint * const __start___tracepoints_ptrs[];
> > > extern struct tracepoint * const __stop___tracepoints_ptrs[];
> > >
> > > +DEFINE_SRCU(tracepoint_srcu);
> > > +EXPORT_SYMBOL_GPL(tracepoint_srcu);
> > > +
> > > /* Set to 1 to enable tracepoint debug output */
> > > static const int tracepoint_debug;
> > >
> > > @@ -67,11 +70,16 @@ static inline void *allocate_probes(int count)
> > > 	return p == NULL ? NULL : p->probes;
> > > }
> > >
> > > -static void rcu_free_old_probes(struct rcu_head *head)
> > > +static void srcu_free_old_probes(struct rcu_head *head)
> > > {
> > > 	kfree(container_of(head, struct tp_probes, rcu));
> > > }
> > >
> > > +static void rcu_free_old_probes(struct rcu_head *head)
> > > +{
> > > +	call_srcu(&tracepoint_srcu, head, srcu_free_old_probes);
> >
> > Hmm, is it OK to call call_srcu() from a call_rcu() callback? I guess
> > it would be.

> It is perfectly legal, and quite a bit simpler than setting something
> up to wait for both to complete concurrently.

Cool. Also, if we instead called call_rcu() and call_srcu() one after the
other, I felt there could be a race to free the old data, since both
callbacks would try to free the same set of old probes and would need some
synchronization between them. With the chaining, the ordering is assured,
so there is no question of such a race. I could add this reasoning to the
changelog as well.
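
To make the chaining argument concrete, here is a rough user-space sketch. It
is not kernel code: everything named mock_* and run_chained_free_demo() is
made up for illustration, the two queues only stand in for call_rcu() and
call_srcu(), and a "grace period" here just means draining a queue. It only
shows that with chaining there is a single place that frees the old probes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct rcu_head (NOT the kernel type). */
struct mock_head {
	struct mock_head *next;
	void (*func)(struct mock_head *head);
};

/* Hypothetical callback queue: one models call_rcu(), one call_srcu(). */
struct mock_queue {
	struct mock_head *first;
	struct mock_head **tail;
};

static struct mock_queue rcu_q  = { NULL, &rcu_q.first };
static struct mock_queue srcu_q = { NULL, &srcu_q.first };

/* Queue a callback, loosely like call_rcu()/call_srcu() would. */
static void mock_enqueue(struct mock_queue *q, struct mock_head *head,
			 void (*func)(struct mock_head *head))
{
	head->func = func;
	head->next = NULL;
	*q->tail = head;
	q->tail = &head->next;
}

/* Pretend a grace period has elapsed: run every queued callback once. */
static void mock_grace_period(struct mock_queue *q)
{
	struct mock_head *list = q->first;

	q->first = NULL;
	q->tail = &q->first;
	while (list) {
		/* Save next first: the callback may free or requeue the head. */
		struct mock_head *next = list->next;

		list->func(list);
		list = next;
	}
}

/* Hypothetical stand-in for struct tp_probes. */
struct mock_probes {
	struct mock_head head;	/* first member, so a cast recovers the container */
	int payload;
};

static int free_count;

/* Second stage: the only place the old probes are freed. */
static void mock_srcu_free_old_probes(struct mock_head *head)
{
	free_count++;
	free((struct mock_probes *)head);
}

/* First stage: do not free, just hand the same head to the second queue. */
static void mock_rcu_free_old_probes(struct mock_head *head)
{
	mock_enqueue(&srcu_q, head, mock_srcu_free_old_probes);
}

/* Returns how many times the old probes were freed. */
int run_chained_free_demo(void)
{
	struct mock_probes *old = malloc(sizeof(*old));

	assert(old != NULL);
	old->payload = 42;
	free_count = 0;

	mock_enqueue(&rcu_q, &old->head, mock_rcu_free_old_probes);
	mock_grace_period(&rcu_q);	/* first stage runs, nothing freed yet */
	assert(free_count == 0);
	mock_grace_period(&srcu_q);	/* second stage runs, frees once */
	return free_count;
}
```

Calling run_chained_free_demo() reports how many times the old probes were
freed. Since the first-stage callback only requeues the head, the free happens
in exactly one callback and no extra synchronization between the two is needed.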

thanks,

- Joel