Re: [PATCH RFC v5 5/6] tracepoint: Make rcuidle tracepoint callers use SRCU
From: Paul E. McKenney
Date: Tue May 01 2018 - 11:20:28 EST
On Tue, May 01, 2018 at 03:16:02PM +0000, Joel Fernandes wrote:
> On Tue, May 1, 2018 at 7:34 AM Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
> wrote:
>
> > On Tue, May 01, 2018 at 10:24:01AM -0400, Steven Rostedt wrote:
> > > On Mon, 30 Apr 2018 18:42:03 -0700
> > > Joel Fernandes <joelaf@xxxxxxxxxx> wrote:
> > >
> > > > In recent tests with IRQ on/off tracepoints, a large performance
> > > > overhead (~10%) is noticed when running hackbench. This is root-caused
> > > > to calls to rcu_irq_enter_irqson and rcu_irq_exit_irqson from the
> > > > tracepoint code. Following a long discussion on the list [1] about
> > > > this, we concluded that SRCU is a better alternative for use during
> > > > RCU idle. Although it does involve extra barriers, it's lighter than
> > > > the sched-rcu version, which has to do additional RCU calls to notify
> > > > RCU idle about entry into RCU sections.
> > > >
> > > > In this patch, we change the underlying implementation of the
> > > > trace_*_rcuidle API to use SRCU. This has been shown to improve
> > > > performance a lot for the high-frequency IRQ enable/disable tracepoints.
>
> > [ . . . ]
>
> > > > --- a/kernel/tracepoint.c
> > > > +++ b/kernel/tracepoint.c
> > > > @@ -31,6 +31,9 @@
> > > > extern struct tracepoint * const __start___tracepoints_ptrs[];
> > > > extern struct tracepoint * const __stop___tracepoints_ptrs[];
> > > >
> > > > +DEFINE_SRCU(tracepoint_srcu);
> > > > +EXPORT_SYMBOL_GPL(tracepoint_srcu);
> > > > +
> > > > /* Set to 1 to enable tracepoint debug output */
> > > > static const int tracepoint_debug;
> > > >
> > > > @@ -67,11 +70,16 @@ static inline void *allocate_probes(int count)
> > > > 	return p == NULL ? NULL : p->probes;
> > > > }
> > > >
> > > > -static void rcu_free_old_probes(struct rcu_head *head)
> > > > +static void srcu_free_old_probes(struct rcu_head *head)
> > > > {
> > > > 	kfree(container_of(head, struct tp_probes, rcu));
> > > > }
> > > >
> > > > +static void rcu_free_old_probes(struct rcu_head *head)
> > > > +{
> > > > +	call_srcu(&tracepoint_srcu, head, srcu_free_old_probes);
> > >
> > > Hmm, is it OK to call call_srcu() from a call_rcu() callback? I guess
> > > it would be.
>
> > It is perfectly legal, and quite a bit simpler than setting something
> > up to wait for both to complete concurrently.
>
> Cool. Also, in this case, if we call both in sequence, I felt there could
> be a race to free the old data, since both callbacks would try to free the
> same set of old probes. That would need some synchronization between the
> two callbacks. With the chaining, the ordering is assured, so there is no
> question of such a race. I could add this reasoning to the changelog as
> well.

Actually, as long as you have a solid happens-before between both of the
callbacks and the freeing, you are in good shape. A release-acquire would
work fine, as would a lock acquired in both callbacks and then acquired
(and possibly released) before the free.
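
For illustration only, here is a minimal userspace C11 sketch of the
release-acquire variant, assuming each of the two callbacks decrements a
shared counter and whichever one brings it to zero does the free. It is not
kernel code and not what the patch does (the patch chains call_srcu() from
the call_rcu() callback). The names struct old_probes, grace_period_done(),
rcu_cb(), srcu_cb(), and run_demo() are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for the old probe array (not struct tp_probes). */
struct old_probes {
	atomic_int pending;	/* grace periods still outstanding */
	int payload;
};

static int frees;	/* demo-only bookkeeping: number of frees performed */

/*
 * Invoked once from each grace-period callback, in either order and
 * possibly on different CPUs.  The acq_rel decrement gives the needed
 * happens-before: the release half publishes this caller's earlier
 * accesses, and the acquire half makes the other caller's accesses
 * visible to whichever caller observes the count reaching zero.
 */
static void grace_period_done(struct old_probes *p)
{
	if (atomic_fetch_sub_explicit(&p->pending, 1,
				      memory_order_acq_rel) == 1) {
		free(p);	/* only the last caller gets here */
		frees++;
	}
}

/* Stand-ins for the call_rcu() and call_srcu() callbacks. */
static void rcu_cb(struct old_probes *p)  { grace_period_done(p); }
static void srcu_cb(struct old_probes *p) { grace_period_done(p); }

/*
 * Drive both callbacks in the requested order and return the number of
 * frees.  The callbacks are run sequentially here only to keep the demo
 * free of threading dependencies.
 */
static int run_demo(int srcu_first)
{
	struct old_probes *p = malloc(sizeof(*p));

	assert(p != NULL);
	atomic_init(&p->pending, 2);	/* one count per grace-period type */
	p->payload = 42;
	frees = 0;

	if (srcu_first) {
		srcu_cb(p);
		rcu_cb(p);
	} else {
		rcu_cb(p);
		srcu_cb(p);
	}
	return frees;
}
```

Calling run_demo(0) and run_demo(1) from a main() exercises both orderings.
Either way the object is freed exactly once, which is the property the
chaining in the patch gets for free from its ordering.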
Thanx, Paul