Re: [RFC PATCH 4/4] sched/core: Do numa balance in cfs_migration

From: Valentin Schneider
Date: Fri Nov 05 2021 - 13:02:39 EST


On 04/11/21 14:57, Yafang Shao wrote:
> Similar to active load balance, the numa balance work is also applied to
> cfs tasks only and it shouldn't preempt other FIFO tasks. We'd better assign
> cfs_migration to the numa balance as well.
>
> Signed-off-by: Yafang Shao <laoar.shao@xxxxxxxxx>
> Cc: Valentin Schneider <valentin.schneider@xxxxxxx>
> Cc: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
> Cc: Dietmar Eggemann <dietmar.eggemann@xxxxxxx>
> ---
> kernel/sched/core.c | 2 +-
> kernel/sched/fair.c | 13 +++++++++++++
> kernel/sched/sched.h | 2 ++
> 3 files changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 9cb81ef8acc8..4a37b06715f4 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -8724,7 +8724,7 @@ int migrate_task_to(struct task_struct *p, int target_cpu)
>  	/* TODO: This is not properly updating schedstats */
>
>  	trace_sched_move_numa(p, curr_cpu, target_cpu);
> -	return stop_one_cpu(curr_cpu, migration_cpu_stop, &arg);
> +	return wakeup_cfs_migrater(curr_cpu, migration_cpu_stop, &arg);

So that one I find really icky - migration_cpu_stop() really is meant to be
run from a CPU stopper (cf. the _cpu_stop suffix). IMO this is the opportunity
to make NUMA balancing reuse the CFS active balance logic here, but per my
previous email I'd say it could be done as a second step.
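
For anyone following along, the handoff that wakeup_cfs_migrater() does in
the fair.c hunk below (queue a work item for a per-CPU thread, then block
until that thread signals completion) can be sketched in user space. This
is only an illustration with pthreads, not kernel code: struct migrater,
migrater_run_sync() and demo_move() are made-up names, and the single-slot
queue assumes one caller at a time.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

typedef int (*work_fn_t)(void *arg);

/* Completion object: the caller sleeps on it until the worker is done. */
struct done {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool completed;
	int ret;
};

struct work {
	work_fn_t fn;
	void *arg;
	struct done *done;
};

/* Hypothetical stand-in for a per-CPU migration thread. */
struct migrater {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	struct work *pending;	/* single-slot queue (assumption) */
	bool stop;
	pthread_t thread;
};

static void *migrater_thread(void *data)
{
	struct migrater *m = data;

	for (;;) {
		pthread_mutex_lock(&m->lock);
		while (!m->pending && !m->stop)
			pthread_cond_wait(&m->cond, &m->lock);
		struct work *w = m->pending;
		m->pending = NULL;
		pthread_mutex_unlock(&m->lock);

		if (!w)
			return NULL;	/* stop requested, nothing queued */

		/* Run the callback in worker context, then wake the caller. */
		int ret = w->fn(w->arg);
		struct done *d = w->done;

		pthread_mutex_lock(&d->lock);
		d->ret = ret;
		d->completed = true;
		pthread_cond_signal(&d->cond);
		pthread_mutex_unlock(&d->lock);
	}
}

int migrater_start(struct migrater *m)
{
	pthread_mutex_init(&m->lock, NULL);
	pthread_cond_init(&m->cond, NULL);
	m->pending = NULL;
	m->stop = false;
	return pthread_create(&m->thread, NULL, migrater_thread, m);
}

void migrater_stop(struct migrater *m)
{
	pthread_mutex_lock(&m->lock);
	m->stop = true;
	pthread_cond_signal(&m->cond);
	pthread_mutex_unlock(&m->lock);
	pthread_join(m->thread, NULL);
}

/* Queue fn(arg) on the worker and block until it has run. */
int migrater_run_sync(struct migrater *m, work_fn_t fn, void *arg)
{
	struct done done = {
		PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, false, 0
	};
	struct work work = { fn, arg, &done };

	pthread_mutex_lock(&m->lock);
	assert(m->pending == NULL);	/* one caller at a time (assumption) */
	m->pending = &work;
	pthread_cond_signal(&m->cond);
	pthread_mutex_unlock(&m->lock);

	pthread_mutex_lock(&done.lock);
	while (!done.completed)
		pthread_cond_wait(&done.cond, &done.lock);
	pthread_mutex_unlock(&done.lock);

	return done.ret;
}

/* Toy callback: "moves" a task unless the destination is invalid. */
struct demo_arg { int cpu; int dest_cpu; };

int demo_move(void *arg)
{
	struct demo_arg *a = arg;

	if (a->dest_cpu < 0)
		return -1;
	a->cpu = a->dest_cpu;
	return 0;
}
```

A caller would do migrater_start(&m), then migrater_run_sync(&m, demo_move,
&a), and finally migrater_stop(&m). The sketch returns int so that a
negative value from the callback reaches the caller unchanged. Note the
callback runs in an ordinary, preemptible thread here, which is exactly the
property a function written for a CPU stopper may not expect.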

>  }
>
>  /*
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 932f63baeb82..b7a155e05c98 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -11960,6 +11960,19 @@ static void wakeup_cfs_migrater_nowait(unsigned int cpu, cpu_stop_fn_t fn, void
>  	cfs_migration_queue_work(cpu, work_buf);
>  }
>
> +bool wakeup_cfs_migrater(unsigned int cpu, cpu_stop_fn_t fn, void *arg)
> +{
> +	struct cpu_stop_done done;
> +	struct cpu_stop_work work = { .fn = fn, .arg = arg, .done = &done, .caller = _RET_IP_ };
> +
> +	cpu_stop_init_done(&done, 1);
> +	cfs_migration_queue_work(cpu, &work);
> +	cond_resched();
> +	wait_for_completion(&done.completion);
> +
> +	return done.ret;
> +}
> +
>  static int cfs_migration_should_run(unsigned int cpu)
>  {
>  	struct cfs_migrater *migrater = &per_cpu(cfs_migrater, cpu);
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index a00fc7057d97..7b242c18a6d8 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -3055,6 +3055,8 @@ static inline bool is_per_cpu_kthread(struct task_struct *p)
>
>  	return true;
>  }
> +
> +bool wakeup_cfs_migrater(unsigned int cpu, cpu_stop_fn_t fn, void *arg);
>  #endif
>
>  extern void swake_up_all_locked(struct swait_queue_head *q);
> --
> 2.17.1