Re: [PATCH v2] tracing/osnoise: Force quiescent states while tracing

From: Steven Rostedt
Date: Wed Mar 09 2022 - 12:01:15 EST


On Mon, 7 Mar 2022 19:07:40 +0100
Nicolas Saenz Julienne <nsaenzju@xxxxxxxxxx> wrote:

> At the moment, running osnoise on a nohz_full CPU, or at an uncontested
> FIFO priority, on a PREEMPT_RCU kernel might have the side effect of
> extending grace periods too much. This will prompt RCU to force a
> context switch on the wayward CPU to end the grace period, introducing
> unwarranted noise into the tracer. This behaviour is unavoidable, as
> overly extending grace periods might exhaust the system's memory.
>
> This is exactly the problem extended quiescent states (EQS) were
> created for, and rcu_momentary_dyntick_idle() emulates one by
> performing a zero-duration EQS. So let's make use of it.
>
> In the common case rcu_momentary_dyntick_idle() is fairly inexpensive:
> atomically incrementing a local per-CPU counter and doing a store. So it
> shouldn't affect osnoise's measurements (which have a 1us granularity),
> and we'll call it unconditionally.
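
To illustrate why the common case is that cheap, here is a minimal
userspace sketch of the idea, not the kernel implementation. It assumes
a grace period can treat a CPU as quiescent once a per-CPU counter has
moved since the grace period's snapshot. All toy_* names are
hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy stand-in for a per-CPU EQS counter; not the kernel's data layout. */
struct toy_cpu {
	atomic_ulong eqs_seq;	/* bumped on every (possibly zero-length) EQS */
};

/* Hypothetical analogue of a momentary EQS: a single atomic increment. */
static void toy_momentary_eqs(struct toy_cpu *cpu)
{
	atomic_fetch_add_explicit(&cpu->eqs_seq, 1, memory_order_seq_cst);
}

/* Grace-period side: record the counter when the grace period starts. */
static unsigned long toy_gp_snapshot(struct toy_cpu *cpu)
{
	return atomic_load_explicit(&cpu->eqs_seq, memory_order_acquire);
}

/* The CPU counts as quiescent once the counter differs from the snapshot. */
static bool toy_gp_cpu_quiesced(struct toy_cpu *cpu, unsigned long snap)
{
	return atomic_load_explicit(&cpu->eqs_seq, memory_order_acquire) != snap;
}
```

In this model, a busy loop that never calls toy_momentary_eqs() keeps
toy_gp_cpu_quiesced() false forever, which is the point at which the
real RCU has to force a context switch. A single call per loop
iteration is enough to let the grace period complete.
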
>
> The uncommon cases involve calling rcu_momentary_dyntick_idle() after
> the osnoise process has:
>
> - Received an expedited quiescent state IPI with preemption disabled or
>   during an RCU critical section (activates the rdp->cpu_no_qs.b.exp
>   code-path).
>
> - Been preempted within an RCU critical section and had the
>   subsequent outermost rcu_read_unlock() called with interrupts
>   disabled (the t->rcu_read_unlock_special.b.blocked code-path).
>
> Neither of those is possible at the moment, and they are unlikely to
> become possible in the future given osnoise's loop design. On top of
> this, the noise generated by the situations described above is
> unavoidable and, if not exposed by rcu_momentary_dyntick_idle(), will
> eventually be seen in subsequent rcu_read_unlock() calls or schedule
> operations.
>
> Fixes: bce29ac9ce0b ("trace: Add osnoise tracer")
> Signed-off-by: Nicolas Saenz Julienne <nsaenzju@xxxxxxxxxx>
> Acked-by: Paul E. McKenney <paulmck@xxxxxxxxxx>
> ---
>
> Changes since v1:
> - Use local_irq_{enable,disable}()
> - Update commit message and comments to better explain RCU's behaviour
> - Get rid of nohz_full and tick checks
> - Comment on rcu_momentary_dyntick_idle()'s eventual execution cost
>
> kernel/trace/trace_osnoise.c | 20 ++++++++++++++++++++
> 1 file changed, 20 insertions(+)

Applied, thanks Nicolas!

-- Steve

>
> diff --git a/kernel/trace/trace_osnoise.c b/kernel/trace/trace_osnoise.c
> index a96d777350fa..ae5e314d7083 100644
> --- a/kernel/trace/trace_osnoise.c
> +++ b/kernel/trace/trace_osnoise.c
> @@ -1388,6 +1388,26 @@ static int run_osnoise(void)
>  			osnoise_stop_tracing();
>  		}
>
> +		/*
> +		 * In some cases, notably when running on a nohz_full CPU with
> +		 * a stopped tick PREEMPT_RCU has no way to account for QSs.
> +		 * This will eventually cause unwarranted noise as PREEMPT_RCU
> +		 * will force preemption as the means of ending the current
> +		 * grace period. We avoid this problem by calling
> +		 * rcu_momentary_dyntick_idle(), which performs a zero duration
> +		 * EQS allowing PREEMPT_RCU to end the current grace period.
> +		 * This call shouldn't be wrapped inside an RCU critical
> +		 * section.
> +		 *
> +		 * Note that in non PREEMPT_RCU kernels QSs are handled through
> +		 * cond_resched()
> +		 */
> +		if (IS_ENABLED(CONFIG_PREEMPT_RCU)) {
> +			local_irq_disable();
> +			rcu_momentary_dyntick_idle();
> +			local_irq_enable();
> +		}
> +
>  		/*
>  		 * For the non-preemptive kernel config: let threads runs, if
>  		 * they so wish.