Re: [PATCH v2 1/1] rcu/nocb: Add an option to ON/OFF an offloading from RT context

From: Joel Fernandes
Date: Wed May 11 2022 - 23:11:29 EST


On Wed, May 11, 2022 at 10:57:03AM +0200, Uladzislau Rezki (Sony) wrote:
> Introduce the RCU_NOCB_CPU_CB_BOOST kernel option so that a user can
> decide whether callback offloading is done in a high-priority context
> or not. Note that the option depends on the RCU_NOCB_CPU and RCU_BOOST
> options. For a CONFIG_PREEMPT_RT kernel, both RCU_BOOST and
> RCU_NOCB_CPU_CB_BOOST are enabled by default.
>
> This patch splits the CONFIG_RCU_BOOST config into two pieces:
> a) boosting preempted RCU readers and the kthreads which are
>    directly responsible for driving expedited grace periods
>    forward;
> b) boosting the offloading kthreads by changing their scheduling
>    class from SCHED_NORMAL to SCHED_FIFO.
>
> The main reason for this split is that, for example, on Android there
> are some workloads which require expedited grace periods to complete
> quickly, whereas offloading in RT context can lead to starvation
> and to hogging a CPU for a long time, which is not acceptable in a
> latency-sensitive environment. For instance:
>
> <snip>
> <...>-60 [006] d..1 2979.028717: rcu_batch_start: rcu_preempt CBs=34619 bl=270
> <snip>
>
> Invoking 34619 callbacks takes time, so other CFS tasks waiting in
> the run-queue are starved while the offloading kthread runs.
>
> v1 -> v2:
> - fix the comment about the rcuc/rcub/rcuop;
> - check the kthread_prio against zero value;
> - by default the RCU_NOCB_CPU_CB_BOOST is ON for PREEMPT_RT.
>
> Signed-off-by: Uladzislau Rezki (Sony) <urezki@xxxxxxxxx>

Acked-by: Joel Fernandes (Google) <joel@xxxxxxxxxxxxxxxxx>

thanks,

- Joel


> ---
> kernel/rcu/Kconfig | 14 ++++++++++++++
> kernel/rcu/tree.c | 6 +++++-
> kernel/rcu/tree_nocb.h | 3 ++-
> 3 files changed, 21 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/rcu/Kconfig b/kernel/rcu/Kconfig
> index 27aab870ae4c..a4ed7b5e2b75 100644
> --- a/kernel/rcu/Kconfig
> +++ b/kernel/rcu/Kconfig
> @@ -275,6 +275,20 @@ config RCU_NOCB_CPU_DEFAULT_ALL
> Say Y here if you want offload all CPUs by default on boot.
> Say N here if you are unsure.
>
> +config RCU_NOCB_CPU_CB_BOOST
> + bool "Offload RCU callback from real-time kthread"
> + depends on RCU_NOCB_CPU && RCU_BOOST
> + default y if PREEMPT_RT
> + help
> + Use this option to invoke offloaded callbacks from a SCHED_FIFO
> + context so that they are processed faster. A side effect of this
> + approach is added latency, especially for SCHED_OTHER tasks, which
> + will not be able to preempt an offloading kthread. That latency
> + depends on the number of callbacks to be invoked.
> +
> + Say Y here if you want to set RT priority for offloading kthreads.
> + Say N here if you are unsure.
> +
> config TASKS_TRACE_RCU_READ_MB
> bool "Tasks Trace RCU readers use memory barriers in user and idle"
> depends on RCU_EXPERT && TASKS_TRACE_RCU
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 9dc4c4e82db6..1c3852b1e0c8 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -154,7 +154,11 @@ static void sync_sched_exp_online_cleanup(int cpu);
> static void check_cb_ovld_locked(struct rcu_data *rdp, struct rcu_node *rnp);
> static bool rcu_rdp_is_offloaded(struct rcu_data *rdp);
>
> -/* rcuc/rcub/rcuop kthread realtime priority */
> +/*
> + * rcuc/rcub/rcuop kthread realtime priority. The "rcuop"
> + * real-time priority(enabling/disabling) is controlled by
> + * the extra CONFIG_RCU_NOCB_CPU_CB_BOOST configuration.
> + */
> static int kthread_prio = IS_ENABLED(CONFIG_RCU_BOOST) ? 1 : 0;
> module_param(kthread_prio, int, 0444);
>
> diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
> index 60cc92cc6655..fa8e4f82e60c 100644
> --- a/kernel/rcu/tree_nocb.h
> +++ b/kernel/rcu/tree_nocb.h
> @@ -1315,8 +1315,9 @@ static void rcu_spawn_cpu_nocb_kthread(int cpu)
> if (WARN_ONCE(IS_ERR(t), "%s: Could not start rcuo CB kthread, OOM is now expected behavior\n", __func__))
> goto end;
>
> - if (kthread_prio)
> + if (IS_ENABLED(CONFIG_RCU_NOCB_CPU_CB_BOOST) && kthread_prio)
> sched_setscheduler_nocheck(t, SCHED_FIFO, &sp);
> +
> WRITE_ONCE(rdp->nocb_cb_kthread, t);
> WRITE_ONCE(rdp->nocb_gp_kthread, rdp_gp->nocb_gp_kthread);
> return;
> --
> 2.30.2
>