Re: [PATCH v3 10/19] sched: Fix migrate_disable() vs set_cpus_allowed_ptr()
From: Peter Zijlstra
Date: Thu Oct 15 2020 - 10:20:18 EST
On Thu, Oct 15, 2020 at 02:54:53PM +0100, Valentin Schneider wrote:
>
> On 15/10/20 12:05, Peter Zijlstra wrote:
> > +static int affine_move_task(struct rq *rq, struct rq_flags *rf,
> > + struct task_struct *p, int dest_cpu, unsigned int flags)
> > +{
> > + struct set_affinity_pending my_pending = { }, *pending = NULL;
> > + struct migration_arg arg = {
> > + .task = p,
> > + .dest_cpu = dest_cpu,
> > + };
> > + bool complete = false;
> > +
> > + /* Can the task run on the task's current CPU? If so, we're done */
> > + if (cpumask_test_cpu(task_cpu(p), &p->cpus_mask)) {
> > + pending = p->migration_pending;
> > + if (pending) {
> > + p->migration_pending = NULL;
> > + complete = true;
>
> Deciphering my TLA+ deadlock traces leads me to think this needs
>
> refcount_inc(&pending->refs);
>
> because the 'goto do_complete' leads us to an unconditional decrement.
Hurmm. I think you're right. I've updated the patch.
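
For illustration only, the imbalance described above can be modelled in
userspace. This is a rough sketch, not the kernel code: struct
pending_model, drop_ref() and run_model() are hypothetical names, a plain
int stands in for refcount_t, and it assumes the task that installed the
pending request holds one initial reference that it drops itself later.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct set_affinity_pending. */
struct pending_model {
	int refs;	/* models 'refs'; a plain int, not refcount_t */
	bool done;	/* models the completion having been signalled */
};

/*
 * Models the tail reached via 'goto do_complete': signal completion,
 * then drop one reference unconditionally.
 */
void drop_ref(struct pending_model *pending, bool complete)
{
	assert(pending != NULL);
	if (complete)
		pending->done = true;
	pending->refs--;
}

/*
 * One owner (assumed to hold the initial reference) plus one caller
 * that takes the early-exit path and picks up the owner's pending.
 * Returns the count after both have dropped; 0 means balanced.
 */
int run_model(bool early_caller_takes_ref)
{
	struct pending_model pending = { .refs = 1, .done = false };

	/* Early-exit path: the increment suggested in the review. */
	if (early_caller_takes_ref)
		pending.refs++;

	drop_ref(&pending, true);	/* early-exit caller, via do_complete */
	drop_ref(&pending, false);	/* owner, after waiting for completion */

	return pending.refs;
}
```

In this model run_model(true) ends balanced at zero, while run_model(false)
drops one reference more than was ever taken. How that imbalance shows up
with a real refcount_t (and whether it matches the TLA+ traces exactly) is
not something this sketch tries to reproduce.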