Re: [PATCH] sched/dl: Fix race between dl_task_timer() and sched_setaffinity()
From: Peter Zijlstra
Date: Tue May 20 2014 - 03:53:24 EST
On Tue, May 20, 2014 at 09:08:53AM +0400, Kirill Tkhai wrote:
>
>
> 20.05.2014, 04:00, "Peter Zijlstra" <peterz@xxxxxxxxxxxxx>:
> > On Mon, May 19, 2014 at 11:31:19PM +0400, Kirill Tkhai wrote:
> >
> >> @@ -513,9 +513,17 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer)
> >>  						     struct sched_dl_entity,
> >>  						     dl_timer);
> >>  	struct task_struct *p = dl_task_of(dl_se);
> >> -	struct rq *rq = task_rq(p);
> >> +	struct rq *rq;
> >> +again:
> >> +	rq = task_rq(p);
> >>  	raw_spin_lock(&rq->lock);
> >>
> >> +	if (unlikely(rq != task_rq(p))) {
> >> +		/* Task was moved, retrying. */
> >> +		raw_spin_unlock(&rq->lock);
> >> +		goto again;
> >> +	}
> >> +
> >
> > That thing is called: rq = __task_rq_lock(p);
>
> But p->pi_lock is not held. The problem is __task_rq_lock() has lockdep assert.
> Should we change it?
Ok, so now that I'm awake ;-)

So the trivial problem as described in your initial changelog isn't
right, because we cannot call sched_setaffinity() on deadline tasks; or
rather we can, but we can't actually change the affinity mask.

Now I suppose the problem can still actually happen when you change the
root domain and trigger an effective affinity change that way.

That said, no, leave it as you proposed. Adding a *task_rq_lock() variant
without the lockdep assert will only confuse things, as normally we
really should also be taking ->pi_lock.

The only reason we don't strictly need ->pi_lock now is because we're
guaranteed to have p->state == TASK_RUNNING here and are thus free of
ttwu races.
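
For readers following along, the lock-then-recheck loop from the patch can be
sketched in userspace roughly as below. This is only an illustration under
stated assumptions, not kernel code: struct rq and struct task are
hypothetical stand-ins, a pthread mutex replaces the raw spinlock, and
task_rq_lock_retry(), task_set_rq() and retry_lock_demo() are made-up names.
It also assumes that whoever moves the task holds the lock of the runqueue it
is leaving, which is what makes the recheck under the lock sufficient.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; not the kernel's definitions. */
struct rq {
	pthread_mutex_t lock;
};

struct task {
	_Atomic(struct rq *) rq;	/* runqueue the task currently belongs to */
};

/* Lock the runqueue the task is on, retrying if it moved in the meantime. */
static struct rq *task_rq_lock_retry(struct task *p)
{
	for (;;) {
		struct rq *rq = atomic_load(&p->rq);

		pthread_mutex_lock(&rq->lock);
		/* Assumed rule: moving a task requires the source rq lock,
		 * so a match observed under the lock stays valid. */
		if (rq == atomic_load(&p->rq))
			return rq;
		/* Task was moved before we got the lock: drop it and retry. */
		pthread_mutex_unlock(&rq->lock);
	}
}

/* Move the task; the caller must hold the lock of the source runqueue. */
static void task_set_rq(struct task *p, struct rq *dst)
{
	atomic_store(&p->rq, dst);
}

/* Single-threaded walk-through: lock, migrate, unlock, lock again. */
static int retry_lock_demo(void)
{
	struct rq a = { .lock = PTHREAD_MUTEX_INITIALIZER };
	struct rq b = { .lock = PTHREAD_MUTEX_INITIALIZER };
	struct task p;
	struct rq *rq;

	atomic_init(&p.rq, &a);

	rq = task_rq_lock_retry(&p);
	assert(rq == &a);
	task_set_rq(&p, &b);		/* migrate while holding a.lock */
	pthread_mutex_unlock(&rq->lock);

	rq = task_rq_lock_retry(&p);	/* now observes the new runqueue */
	assert(rq == &b);
	pthread_mutex_unlock(&rq->lock);
	return 0;
}
```

The sketch covers only the rq->lock side of the story. It says nothing about
->pi_lock or wakeup races, which the timer path avoids for the reason given
above.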