Re: [PATCH v8 3/6] rcu/trace: Add tracing for how segcb list changes

From: Frederic Weisbecker
Date: Mon Oct 26 2020 - 07:59:18 EST


On Wed, Oct 21, 2020 at 03:08:10PM -0400, Joel Fernandes (Google) wrote:
> Track how the segcb list changes before/after acceleration, during
> queuing and during dequeuing.
>
> This has proved useful to discover an optimization to avoid unwanted GP
> requests when there are no callbacks accelerated. The overhead is minimal as
> each segment's length is now stored in the respective segment.
>
> Signed-off-by: Joel Fernandes (Google) <joel@xxxxxxxxxxxxxxxxx>
> ---
> include/trace/events/rcu.h | 25 +++++++++++++++++++++++++
> kernel/rcu/rcu_segcblist.c | 31 +++++++++++++++++++++++++++++++
> kernel/rcu/rcu_segcblist.h | 5 +++++
> kernel/rcu/tree.c | 9 +++++++++
> 4 files changed, 70 insertions(+)
>
> diff --git a/include/trace/events/rcu.h b/include/trace/events/rcu.h
> index 155b5cb43cfd..9f2237d9b0c8 100644
> --- a/include/trace/events/rcu.h
> +++ b/include/trace/events/rcu.h
> @@ -505,6 +505,31 @@ TRACE_EVENT_RCU(rcu_callback,
> __entry->qlen)
> );
>
> +TRACE_EVENT_RCU(rcu_segcb,

You might extend the segcblist tracing in the future to trace queue/dequeue/merge whatever...
Which would give trace_rcu_segcb_queue, trace_rcu_segcb_dequeue, etc...

So I suggest you rename this one to something like rcu_segcb_stats for precision.


> +
> + TP_PROTO(const char *ctx, int *cb_count, unsigned long *gp_seq),
> +
> + TP_ARGS(ctx, cb_count, gp_seq),
> +
> + TP_STRUCT__entry(
> + __field(const char *, ctx)
> + __array(int, cb_count, RCU_CBLIST_NSEGS)
> + __array(unsigned long, gp_seq, RCU_CBLIST_NSEGS)
> + ),
> +
> + TP_fast_assign(
> + __entry->ctx = ctx;
> + memcpy(__entry->cb_count, cb_count, RCU_CBLIST_NSEGS * sizeof(int));
> + memcpy(__entry->gp_seq, gp_seq, RCU_CBLIST_NSEGS * sizeof(unsigned long));
> + ),
> +
> + TP_printk("%s cb_count: (DONE=%d, WAIT=%d, NEXT_READY=%d, NEXT=%d) "
> + "gp_seq: (DONE=%lu, WAIT=%lu, NEXT_READY=%lu, NEXT=%lu)", __entry->ctx,
> + __entry->cb_count[0], __entry->cb_count[1], __entry->cb_count[2], __entry->cb_count[3],
> + __entry->gp_seq[0], __entry->gp_seq[1], __entry->gp_seq[2], __entry->gp_seq[3])
> +
> +);
> +
> /*
> * Tracepoint for the registration of a single RCU callback of the special
> * kvfree() form. The first argument is the RCU type, the second argument
> diff --git a/kernel/rcu/rcu_segcblist.c b/kernel/rcu/rcu_segcblist.c
> index 357c19bbcb00..b0aaa51e0ee6 100644
> --- a/kernel/rcu/rcu_segcblist.c
> +++ b/kernel/rcu/rcu_segcblist.c
> @@ -14,6 +14,7 @@
> #include <linux/types.h>
>
> #include "rcu_segcblist.h"
> +#include "rcu.h"
>
> /* Initialize simple callback list. */
> void rcu_cblist_init(struct rcu_cblist *rclp)
> @@ -328,6 +329,36 @@ void rcu_segcblist_extract_done_cbs(struct rcu_segcblist *rsclp,
> rcu_segcblist_set_seglen(rsclp, RCU_DONE_TAIL, 0);
> }
>
> +/*
> + * Return how many CBs each segment along with their gp_seq values.
> + *
> + * This function is O(N) where N is the number of segments. Only used from
> + * tracing code which is usually disabled in production.
> + */
> +#ifdef CONFIG_RCU_TRACE
> +static void rcu_segcblist_countseq(struct rcu_segcblist *rsclp,
> + int cbcount[RCU_CBLIST_NSEGS],
> + unsigned long gpseq[RCU_CBLIST_NSEGS])
> +{
> + int i;
> +
> + for (i = 0; i < RCU_CBLIST_NSEGS; i++) {
> + cbcount[i] = rcu_segcblist_get_seglen(rsclp, i);
> + gpseq[i] = rsclp->gp_seq[i];
> + }
> +}

So that is called all the time even if the trace event isn't enabled. The
goal of trace events are also to avoid the overhead of tracing when its off.

This should be moved inside the trace event definition. We can even avoid the
loop altogether.

Thanks.