Re: [RFC][PATCH 2/8] sched/rtmutex/deadline: Fix a PI crash for deadline tasks

From: Peter Zijlstra
Date: Tue Jun 14 2016 - 08:30:36 EST


On Tue, Jun 14, 2016 at 11:21:09AM +0100, Juri Lelli wrote:
> > [XXX this next section is unparsable]
>
> Yes, a bit hard to understand. However, am I correct in assuming this
> patch and the previous one should fix this problem? Or are there still
> other races causing issues?

I think so. There were two related problems:

1) top_waiter was used outside its serialization
2) a race against the top waiter task and sched_setscheduler() changing
its state

Now, I could not understand a word of that marked paragraph, but from my
understanding of the code both are solved.

1) by keeping the top_pi_task cache updated under pi_lock and rq->lock,
thereby ensuring that holding either is sufficient to stabilize it.

2) sched_setscheduler() can change the parameters of the top_pi_task,
but since it too holds both pi_lock and rq->lock, it cannot happen at
the same time that we're looking at the cached top pi waiter pointer
thingy.
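
To make the "holding either is sufficient" rule concrete, here is a
minimal userspace sketch, not the kernel code. It assumes pthread
mutexes as stand-ins for pi_lock and rq->lock; struct task, struct
runqueue and all three function names are hypothetical, made up for
illustration only.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the kernel's real types. */
struct task {
	pthread_mutex_t pi_lock;   /* plays the role of p->pi_lock */
	struct task *top_pi_task;  /* cached top PI waiter, NULL if none */
};

struct runqueue {
	pthread_mutex_t lock;      /* plays the role of rq->lock */
};

/*
 * Writer: updates the cache only while holding BOTH locks, always
 * taken in the same order (pi_lock first, then the runqueue lock).
 */
static void set_top_pi_task(struct task *p, struct runqueue *rq,
			    struct task *top)
{
	assert(p != NULL && rq != NULL);

	pthread_mutex_lock(&p->pi_lock);
	pthread_mutex_lock(&rq->lock);
	p->top_pi_task = top;
	pthread_mutex_unlock(&rq->lock);
	pthread_mutex_unlock(&p->pi_lock);
}

/*
 * Reader holding only pi_lock: every writer needs pi_lock too, so the
 * cached pointer cannot change while we look at it.
 */
static struct task *top_under_pi_lock(struct task *p)
{
	struct task *top;

	pthread_mutex_lock(&p->pi_lock);
	top = p->top_pi_task;
	pthread_mutex_unlock(&p->pi_lock);
	return top;	/* only a snapshot once the lock is dropped */
}

/*
 * Reader holding only the runqueue lock: equally sufficient, because
 * every writer needs that lock as well.
 */
static struct task *top_under_rq_lock(struct task *p, struct runqueue *rq)
{
	struct task *top;

	pthread_mutex_lock(&rq->lock);
	top = p->top_pi_task;
	pthread_mutex_unlock(&rq->lock);
	return top;	/* only a snapshot once the lock is dropped */
}
```

The readers drop the lock before returning only to keep the sketch
short; a real caller would keep it held for as long as it uses the
pointer.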

It can however happen that the cached top_pi_task is not in fact the top
waiter, in a narrow window between sched_setscheduler() changing its
parameters and rt_mutex_adjust_pi() re-ordering the PI chain, which ends
with updating the cached top task pointer thingy.