Re: [PATCH rcu/next 2/3] rcu: Move trace_rcu_callback() before bypassing

From: Frederic Weisbecker
Date: Fri Sep 16 2022 - 07:11:30 EST


On Thu, Sep 15, 2022 at 12:14:18AM +0000, Joel Fernandes (Google) wrote:
> If any CB is queued into the bypass list, then trace_rcu_callback() does
> not show it. This makes it not clear when a callback was actually
> queued, as you only end up getting a trace_rcu_invoke_callback() trace.
> Fix it by moving trace_rcu_callback() before
> trace_rcu_nocb_try_bypass().
>
> Signed-off-by: Joel Fernandes (Google) <joel@xxxxxxxxxxxxxxxxx>
> ---
> kernel/rcu/tree.c | 10 ++++++----
> 1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 5ec97e3f7468..9fe581be8696 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -2809,10 +2809,7 @@ void call_rcu(struct rcu_head *head, rcu_callback_t func)
> }
>
> check_cb_ovld(rdp);
> - if (rcu_nocb_try_bypass(rdp, head, &was_alldone, flags))
> - return; // Enqueued onto ->nocb_bypass, so just leave.
> - // If no-CBs CPU gets here, rcu_nocb_try_bypass() acquired ->nocb_lock.
> - rcu_segcblist_enqueue(&rdp->cblist, head);
> +
> if (__is_kvfree_rcu_offset((unsigned long)func))
> trace_rcu_kvfree_callback(rcu_state.name, head,
> (unsigned long)func,
> @@ -2821,6 +2818,11 @@ void call_rcu(struct rcu_head *head, rcu_callback_t func)
> trace_rcu_callback(rcu_state.name, head,
> rcu_segcblist_n_cbs(&rdp->cblist));
>
> + if (rcu_nocb_try_bypass(rdp, head, &was_alldone, flags))
> + return; // Enqueued onto ->nocb_bypass, so just leave.
> + // If no-CBs CPU gets here, rcu_nocb_try_bypass() acquired ->nocb_lock.
> + rcu_segcblist_enqueue(&rdp->cblist, head);
> +
> trace_rcu_segcb_stats(&rdp->cblist, TPS("SegCBQueued"));
>
> /* Go handle any RCU core processing required. */

Two subtle changes induced here:

* rcu_segcblist_n_cbs() is now read locklessly. It's only tracing, so no huge
deal, but if this races with callback invocation, we may on rare occasions
record stale counts in the traces while enqueuing (think of rcu_top, for example).

* trace_rcu_callback() will now show the number of callbacks _before_ enqueuing
instead of _after_. Not sure if it matters, but tools sometimes rely on trace
events.
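
To make the second point concrete, here is a userspace toy model, not kernel
code. struct toy_cblist and both helpers are made up for illustration; they only
mimic sampling the count after versus before the enqueue.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for rdp->cblist: only the callback count is modelled. */
struct toy_cblist {
	long n_cbs;
};

/* Original ordering: enqueue first, then sample the count for the trace. */
long trace_count_after_enqueue(struct toy_cblist *cl)
{
	assert(cl != NULL);
	cl->n_cbs++;		/* models rcu_segcblist_enqueue() */
	return cl->n_cbs;	/* value trace_rcu_callback() would log */
}

/* Patched ordering: sample the count for the trace, then enqueue. */
long trace_count_before_enqueue(struct toy_cblist *cl)
{
	long traced;

	assert(cl != NULL);
	traced = cl->n_cbs;	/* value trace_rcu_callback() would log */
	cl->n_cbs++;		/* models rcu_segcblist_enqueue() */
	return traced;
}
```

Starting from the same list length, the two orderings log counts that differ by
one, which is the off-by-one a tool parsing the event would see.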

To avoid all that, how about a new trace_rcu_nocb_bypass() instead?
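
Roughly what I have in mind, again as a userspace toy model rather than a patch.
struct toy_rdp, toy_call_rcu() and the event enum are hypothetical, and having
the bypass event log the bypass list length is only an assumption. The bypass
path gets its own event and the regular path keeps reporting the count after
enqueuing, as it does today.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_event { TOY_EV_NONE, TOY_EV_BYPASS, TOY_EV_CALLBACK };

struct toy_rdp {
	long n_cbs;			/* models rcu_segcblist_n_cbs(&rdp->cblist) */
	long n_bypass;			/* models the length of ->nocb_bypass */
	enum toy_event last_event;	/* last event that would have been traced */
	long last_traced_count;		/* count carried by that event */
};

/*
 * use_bypass stands in for the decision rcu_nocb_try_bypass() makes
 * internally; the real function takes no such parameter.
 */
void toy_call_rcu(struct toy_rdp *rdp, bool use_bypass)
{
	assert(rdp != NULL);

	if (use_bypass) {
		rdp->n_bypass++;
		/* Hypothetical new bypass event (assumed payload). */
		rdp->last_event = TOY_EV_BYPASS;
		rdp->last_traced_count = rdp->n_bypass;
		return;
	}

	rdp->n_cbs++;			/* models rcu_segcblist_enqueue() */
	/* Existing event, still emitted after the enqueue. */
	rdp->last_event = TOY_EV_CALLBACK;
	rdp->last_traced_count = rdp->n_cbs;
}
```

That way bypassed callbacks become visible in the trace without changing what
the existing event reports.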

Thanks.