Re: [PATCH] sched/core: simpler function for sched_exec migration

From: Oleg Nesterov
Date: Mon Sep 05 2016 - 09:11:56 EST


On 09/05, cheng chao wrote:
>
> @@ -2958,7 +2958,7 @@ void sched_exec(void)
> struct migration_arg arg = { p, dest_cpu };
>
> raw_spin_unlock_irqrestore(&p->pi_lock, flags);
> - stop_one_cpu(task_cpu(p), migration_cpu_stop, &arg);
> + stop_one_cpu_sync(task_cpu(p), migration_cpu_stop, &arg);
> return;
> }
> unlock:
> diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
> index 4a1ca5f..24f8637 100644
> --- a/kernel/stop_machine.c
> +++ b/kernel/stop_machine.c
> @@ -130,6 +130,27 @@ int stop_one_cpu(unsigned int cpu, cpu_stop_fn_t fn, void *arg)
> return done.ret;
> }
>
> +/**
> + * the caller keeps task_on_rq_queued, so it's more suitable for
> + * sched_exec on the case when needs migration
> + */
> +void stop_one_cpu_sync(unsigned int cpu, cpu_stop_fn_t fn, void *arg)
> +{
> + struct cpu_stop_work work = { .fn = fn, .arg = arg, .done = NULL };
> +
> + if (!cpu_stop_queue_work(cpu, &work))
> + return;
> +
> +#if defined(CONFIG_PREEMPT_NONE) || defined(CONFIG_PREEMPT_VOLUNTARY)
> + /*
> + * CONFIG_PREEMPT doesn't need call schedule here, because
> + * preempt_enable already does the similar thing when call
> + * cpu_stop_queue_work
> + */
> + schedule();
> +#endif
> +}

Honestly, I don't really understand the changelog, but this looks wrong.

stop_one_cpu_sync() assumes that cpu == smp_processor_id() == task_cpu(current),
so the stopper thread should preempt us no later than the schedule() call
(in the CONFIG_PREEMPT_NONE case), and thus we do not need to synchronize.

But is this necessarily true? This task can migrate to another CPU
before cpu_stop_queue_work() is called.
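
To make the window concrete, here is a rough user-space model of the
concern. It is not kernel code: struct task, queue_stop_work() and
stopper_runs_on_callers_cpu() are hypothetical names invented for this
illustration. It only shows that the CPU sampled via task_cpu(p) and the
CPU the caller is on when the work is queued need not be the same.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a task; only tracks the CPU it runs on. */
struct task {
	int cpu;
};

/* Hypothetical stand-in for queueing stopper work; returns the target CPU. */
static int queue_stop_work(int cpu)
{
	return cpu;
}

/*
 * Models the sequence in the quoted sched_exec() hunk:
 *   1. the CPU is sampled (task_cpu(p)) after pi_lock has been dropped,
 *   2. the task may be migrated (migrate_to >= 0 simulates that),
 *   3. the work is queued on the sampled CPU.
 * Returns 1 if the stopper was queued on the CPU the caller is now on,
 * i.e. the only case where a local schedule() would let it run first.
 */
static int stopper_runs_on_callers_cpu(struct task *t, int migrate_to)
{
	int sampled, queued_on;

	assert(t != NULL);

	sampled = t->cpu;		/* step 1: sample the CPU */

	if (migrate_to >= 0)
		t->cpu = migrate_to;	/* step 2: migration in the window */

	queued_on = queue_stop_work(sampled);	/* step 3: queue the work */

	return queued_on == t->cpu;
}
```

With a task starting on CPU 0, stopper_runs_on_callers_cpu(&t, -1)
returns 1 (no migration, the assumption holds), while
stopper_runs_on_callers_cpu(&t, 1) returns 0: the work sits on CPU 0
but the caller is on CPU 1, so schedule() there does not wait for it.
Since work lives on the caller's stack with .done = NULL, returning in
that state looks unsafe as well, if I read the patch correctly.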

Oleg.