Re: [PATCH] sched/deadline/rtmutex: Fix a PI crash for deadline tasks

From: Xunlei Pang
Date: Tue Apr 05 2016 - 06:48:48 EST


On 2016/04/05 at 17:29, Peter Zijlstra wrote:
> On Tue, Apr 05, 2016 at 11:19:54AM +0200, Peter Zijlstra wrote:
>> Or did I miss something (again) ? :-)
>>
>> ---
>> kernel/locking/rtmutex.c | 4 ++--
>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
>> index 3e746607abe5..36eb232bd29f 100644
>> --- a/kernel/locking/rtmutex.c
>> +++ b/kernel/locking/rtmutex.c
>> @@ -1390,11 +1390,11 @@ rt_mutex_fastunlock(struct rt_mutex *lock,
>>  	} else {
>>  		bool deboost = slowfn(lock, &wake_q);
>>
>> -		wake_up_q(&wake_q);
>> -
>>  		/* Undo pi boosting if necessary: */
>>  		if (deboost)
>>  			rt_mutex_adjust_prio(current);
>> +
>> +		wake_up_q(&wake_q);
>>  	}
>>  }
> So one potential issue with this -- and this might be the reason this code
> is the way it is -- is that the moment we de-boost we can get preempted,
> before having had a chance to wake the higher prio task, getting
> ourselves into a prio-inversion.
>
> But again, that should be fairly simple to fix.

This is cool. I think we should also initialize "pi_task" properly for INIT_MUTEX
and at fork; otherwise it looks good to me :-)

Also, do you think we can drop "pi_waiters_leftmost" from task_struct, since we
can easily derive it from "pi_waiters"?
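
To make the "pi_waiters_leftmost" question concrete, here is a minimal
user-space sketch (not kernel code) of deriving the leftmost node by walking
the tree instead of caching a pointer. The names "waiter_node",
"waiter_insert" and "waiter_leftmost" are hypothetical, a plain unbalanced
binary search tree stands in for the kernel rbtree, and a lower "prio" value
is assumed to mean higher priority:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a PI waiter; lower prio value = higher priority. */
struct waiter_node {
	int prio;
	struct waiter_node *left, *right;
};

/* Insert w into a plain (unbalanced) binary search tree; returns the root. */
static struct waiter_node *waiter_insert(struct waiter_node *root,
					 struct waiter_node *w)
{
	struct waiter_node **link = &root;

	assert(w != NULL);
	w->left = w->right = NULL;

	/* Walk down until we find the empty slot where w belongs. */
	while (*link)
		link = (w->prio < (*link)->prio) ? &(*link)->left
						 : &(*link)->right;
	*link = w;
	return root;
}

/* Derive the top waiter by following left links; no cached pointer needed. */
static struct waiter_node *waiter_leftmost(struct waiter_node *root)
{
	if (!root)
		return NULL;
	while (root->left)
		root = root->left;
	return root;
}
```

In the kernel the equivalent lookup would presumably be rb_first() on the
"pi_waiters" root. The cost is a walk down the left spine on every lookup
rather than a single pointer read, so whether dropping the cached field is
worthwhile depends on how hot that path is.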

I will test it further with these new changes soon.

Regards,
Xunlei

>
> --
> kernel/locking/rtmutex.c | 16 +++++++++++++---
> 1 file changed, 13 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
> index 3e746607abe5..1896baf28e9c 100644
> --- a/kernel/locking/rtmutex.c
> +++ b/kernel/locking/rtmutex.c
> @@ -1390,11 +1390,21 @@ rt_mutex_fastunlock(struct rt_mutex *lock,
>  	} else {
>  		bool deboost = slowfn(lock, &wake_q);
>
> -		wake_up_q(&wake_q);
> -
> -		/* Undo pi boosting if necessary: */
> +		/*
> +		 * Undo pi boosting (if necessary) and wake top waiter.
> +		 *
> +		 * We should deboost before waking the high-prio task such that
> +		 * we don't run two tasks with the 'same' state. This however
> +		 * can lead to prio-inversion if we would get preempted after
> +		 * the deboost but before waking our high-prio task, hence the
> +		 * preempt_disable.
> +		 */
> +		preempt_disable();
>  		if (deboost)
>  			rt_mutex_adjust_prio(current);
> +
> +		wake_up_q(&wake_q);
> +		preempt_enable();
>  	}
>  }
>
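
The ordering argument in the patch comment above can be checked with a toy
user-space model. This is only a sketch under stated assumptions, not a
description of the scheduler: there is one lock owner boosted to the
waiter's priority, one medium-priority runnable task, lower numbers mean
higher priority, and the names "unlock_hazards" and "simulate_unlock" are
hypothetical. It reports which of the two hazards discussed in this thread
each ordering is exposed to:

```c
#include <stdbool.h>

/* Assumed priorities for the model; lower value = higher priority. */
enum { PRIO_WAITER = 10, PRIO_MEDIUM = 30, PRIO_OWNER_BASE = 50 };

struct unlock_hazards {
	bool shared_state;     /* waiter runnable while owner is still boosted */
	bool inversion_window; /* medium task can preempt before the wakeup */
};

/* Model the slow unlock path for a given ordering and preemption setting. */
static struct unlock_hazards simulate_unlock(bool deboost_first,
					     bool preempt_disabled)
{
	struct unlock_hazards h = { false, false };
	int owner_prio = PRIO_WAITER; /* owner starts out boosted */

	if (!deboost_first) {
		/* Wake first: the waiter becomes runnable while the owner
		 * still carries the inherited priority. */
		h.shared_state = (owner_prio == PRIO_WAITER);
		owner_prio = PRIO_OWNER_BASE;
	} else {
		/* Deboost first: the owner drops to its base priority ... */
		owner_prio = PRIO_OWNER_BASE;
		/* ... so a medium-priority task may preempt it here, unless
		 * preemption is disabled until after the wakeup. */
		if (!preempt_disabled && PRIO_MEDIUM < owner_prio)
			h.inversion_window = true;
		/* The wakeup happens only now. */
	}
	return h;
}
```

In this model, wake-then-deboost shows the shared-state hazard, plain
deboost-then-wake shows the inversion window, and deboost-then-wake inside a
preempt-disabled section shows neither, which matches the reasoning in the
comment.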