Re: [PATCH v3 1/4] sched/rt: Check to push the task away after its affinity was changed

From: Peter Zijlstra
Date: Sat May 30 2015 - 04:20:52 EST


On Fri, May 29, 2015 at 10:04:36PM +0800, pang.xunlei@xxxxxxxxxx wrote:
> Hi Peter,
>
> Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote 2015-05-29 PM 09:16:26:
> >
> > Re: [PATCH v3 1/4] sched/rt: Check to push the task away after its affinity was changed
> >
> > On Tue, May 12, 2015 at 10:46:41PM +0800, Xunlei Pang wrote:
> > > @@ -2278,6 +2279,20 @@ static void set_cpus_allowed_rt(struct task_struct *p,
> > >  	}
> > >
> > >  	update_rt_migration(&rq->rt);
> > > +
> > > +check_push:
> > > +	if (weight > 1 &&
> > > +	    !task_running(rq, p) &&
> > > +	    !test_tsk_need_resched(rq->curr) &&
> > > +	    !cpumask_subset(new_mask, &p->cpus_allowed)) {
> > > +		/* Update new affinity and try to push. */
> > > +		cpumask_copy(&p->cpus_allowed, new_mask);
> > > +		p->nr_cpus_allowed = weight;
> > > +		push_rt_tasks(rq);
> > > +		return true;
> > > +	}
> > > +
> > > +	return false;
> > >  }
> >
> > I think this is broken; push_rt_tasks() will do double_rq_lock() which
> > will drop rq->lock.
> >
> > This means load-balancing can come in and move our task p; in fact,
> > push_rt_task() can do exactly that -- after all that was the point of
> > this patch.
> >
> > _However_ this means that after calling ->set_cpus_allowed() we must not
> > assume @p is on @rq, yet we do. Look at __set_cpus_allowed_ptr(), we'll
> > call move_queued_task() if (!running || waking) && on_rq, and
> > move_queued_task() happily calls dequeue_task(rq, p), which will go
> > *boom*.
>
> I can't see how this can happen.
>
> After set_cpus_allowed_rt() finishes, if a load-balancing action (pull
> or push) succeeds, task_cpu(@p) is updated to the new CPU, so the
> following condition is guaranteed to be true:
>
> 	/* Can the task run on the task's current CPU? If so, we're done */
> 	if (cpumask_test_cpu(task_cpu(p), new_mask))
> 		goto out;
>
> So I think the function will simply jump to out and return normally.

Humm, yes. I missed that. But that makes it work by accident, because
you didn't document any of this in the code or the Changelog.

That makes me like the thing even less, though.
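
For anyone following along, here is a rough user-space sketch of the early
exit discussed above. It is not kernel code: struct model_task, the 64-bit
mask and every model_* helper are made up for illustration, and the push_to
argument merely stands in for whatever push_rt_tasks() might decide. The only
point is that the caller re-reads the task's CPU after the callback returns,
which is what keeps it away from the dequeue on the old rq.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the few task_struct fields that matter here. */
struct model_task {
	int cpu;		/* models task_cpu(p) */
	uint64_t allowed;	/* models p->cpus_allowed, one bit per CPU */
};

/* Models cpumask_test_cpu() on a plain 64-bit mask. */
static bool model_mask_test_cpu(int cpu, uint64_t mask)
{
	assert(cpu >= 0 && cpu < 64);
	return (mask >> cpu) & 1u;
}

/*
 * Models the patched callback: it installs the new mask and may push the
 * task to another allowed CPU. push_to < 0 means no push took place.
 * Returns true if the task was moved.
 */
static bool model_set_cpus_allowed(struct model_task *p, uint64_t new_mask,
				   int push_to)
{
	p->allowed = new_mask;
	if (push_to >= 0 && model_mask_test_cpu(push_to, new_mask)) {
		p->cpu = push_to;	/* the task left its old runqueue */
		return true;
	}
	return false;
}

/*
 * Models the tail of the caller. Returns true if the caller would still
 * have to migrate the task itself, false if it takes the early exit.
 */
static bool model_needs_migration(struct model_task *p, uint64_t new_mask,
				  int push_to)
{
	model_set_cpus_allowed(p, new_mask, push_to);

	/* Re-read the task's CPU after the callback: the quoted check. */
	if (model_mask_test_cpu(p->cpu, new_mask))
		return false;	/* "goto out" */

	return true;		/* would continue to the migration path */
}
```

In this model, a task on CPU 0 whose new mask is {1, 2} takes the early exit
if the callback pushed it to CPU 1, and falls through to the migration path
if no push happened. The correctness rests entirely on that re-check, which is
the undocumented dependency.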