Re: [RFC PATCH 5/6] sched/fair: trigger the update of blocked load on newly idle cpu

From: Valentin Schneider
Date: Tue Feb 09 2021 - 08:10:41 EST


On 05/02/21 12:48, Vincent Guittot wrote:
> Instead of waking up a random and already idle CPU, we can take advantage
> of this_cpu being about to enter idle to run the ILB and update the
> blocked load.
>
> Signed-off-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
> ---
> include/linux/sched/nohz.h | 2 ++
> kernel/sched/fair.c | 11 ++++++++---
> kernel/sched/idle.c | 6 ++++++
> 3 files changed, 16 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/sched/nohz.h b/include/linux/sched/nohz.h
> index 6d67e9a5af6b..74cdc4e87310 100644
> --- a/include/linux/sched/nohz.h
> +++ b/include/linux/sched/nohz.h
> @@ -9,8 +9,10 @@
> #if defined(CONFIG_SMP) && defined(CONFIG_NO_HZ_COMMON)
> extern void nohz_balance_enter_idle(int cpu);
> extern int get_nohz_timer_target(void);
> +extern void nohz_run_idle_balance(int cpu);
> #else
> static inline void nohz_balance_enter_idle(int cpu) { }
> +static inline void nohz_run_idle_balance(int cpu) { }
> #endif
>
> #ifdef CONFIG_NO_HZ_COMMON
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 935594cd5430..3d2ab28d5736 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -10461,6 +10461,11 @@ static bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
> return true;
> }
>
> +void nohz_run_idle_balance(int cpu)
> +{
> + nohz_idle_balance(cpu_rq(cpu), CPU_IDLE);
> +}
> +
> static void nohz_newidle_balance(struct rq *this_rq)
> {
> int this_cpu = this_rq->cpu;
> @@ -10482,10 +10487,10 @@ static void nohz_newidle_balance(struct rq *this_rq)
> return;
>
> /*
> - * Blocked load of idle CPUs need to be updated.
> - * Kick an ILB to update statistics.
> + * Set the need to trigger ILB in order to update blocked load
> + * before entering idle state.
> */
> - kick_ilb(NOHZ_STATS_KICK);
> + this_rq->nohz_idle_balance = NOHZ_STATS_KICK;
> }
>
> #else /* !CONFIG_NO_HZ_COMMON */
> diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
> index 305727ea0677..52a4e9ce2f9b 100644
> --- a/kernel/sched/idle.c
> +++ b/kernel/sched/idle.c
> @@ -261,6 +261,12 @@ static void cpuidle_idle_call(void)
> static void do_idle(void)
> {
> int cpu = smp_processor_id();
> +
> + /*
> + * Check if we need to update some blocked load
> + */
> + nohz_run_idle_balance(cpu);
> +

What do we gain from doing this here vs having a stats update in
newidle_balance()?

The current approach is to have a combined load_balance() + blocked-load
update during newidle, and I get that this can take too long. But then,
we could still have what you're adding to do_idle() in the tail of
newidle_balance() itself, no? i.e.

newidle_balance()
    ...
    for_each_domain(this_cpu, sd) {
        ...
        pulled_task = load_balance(...);
        ...
    }
    ...
    if (!pulled_task && !this_rq->nr_running) {
        this_rq->nohz_idle_balance = NOHZ_STATS_KICK;
        _nohz_idle_balance();
    }

or somesuch.

> /*
> * If the arch has a polling bit, we maintain an invariant:
> *
> --
> 2.17.1