Re: [PATCH 8/9] sched: Fix migrate_disable() vs set_cpus_allowed_ptr()

From: Peter Zijlstra
Date: Fri Sep 25 2020 - 05:05:47 EST


On Thu, Sep 24, 2020 at 08:59:33PM +0100, Valentin Schneider wrote:
> > @@ -2025,19 +2138,8 @@ static int __set_cpus_allowed_ptr(struct
> > if (cpumask_test_cpu(task_cpu(p), new_mask))
> > goto out;
>
> I think this needs a cancellation of any potential pending migration
> requests. Consider a task P0 running on CPU0:
>
> P0 P1 P2
>
> migrate_disable();
> <preempt>
> set_cpus_allowed_ptr(P0, CPU1);
> // waits for completion
> set_cpus_allowed_ptr(P0, CPU0);
> // Already good, no waiting for completion
> <resumes>
> migrate_enable();
> // task_cpu(p) allowed, no move_task()
>
> AIUI in this scenario P1 would stay forever waiting.

Hurmph, looking at it, I think you're right. But I'm fairly sure I did
test that, maybe I just didn't run it long enough to hit the window ...

> I *think* this can be
> cured by making this function slightly more hideous:

It's a real beauty isn't it :-/

> ---
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 01113e6f941f..829334f00f7b 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -2102,6 +2102,8 @@ static int __set_cpus_allowed_ptr(struct task_struct *p,
> u32 flags)
> {
> const struct cpumask *cpu_valid_mask = cpu_active_mask;
> + struct set_affinity_pending *pending;
> + bool cancel_pending = false;
> unsigned int dest_cpu;
> struct rq_flags rf;
> struct rq *rq;
> @@ -2158,14 +2160,20 @@ static int __set_cpus_allowed_ptr(struct task_struct *p,
> }
>
> /* Can the task run on the task's current CPU? If so, we're done */
> - if (cpumask_test_cpu(task_cpu(p), new_mask))
> + if (cpumask_test_cpu(task_cpu(p), new_mask)) {
> + cancel_pending = true;
> goto out;
> + }
>
> return move_task(rq, &rf, p, dest_cpu, flags);
>
> out:
> + pending = p->migration_pending;
> task_rq_unlock(rq, p, &rf);
>
> + if (cancel_pending && pending)
> + complete_all(&pending->done);
> +
> return ret;
> }

He who completes pending should also clear ->migration_pending,
otherwise the next caller will be able to observe a dangling pointer.

The other approach is trying to handle that last condition in
move_task(), but I'm quite sure that's going to be aweful too :/