Re: [patch 1/6] rtmutex: Fix deadlock detector for real

From: Steven Rostedt
Date: Tue May 27 2014 - 18:09:38 EST


On Thu, 22 May 2014 03:25:39 -0000
Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:

> The current deadlock detection logic does not work reliably due to the
> following early exit path:
>
> /*
> * Drop out, when the task has no waiters. Note,
> * top_waiter can be NULL, when we are in the deboosting
> * mode!
> */
> if (top_waiter && (!task_has_pi_waiters(task) ||
> top_waiter != task_top_pi_waiter(task)))
> goto out_unlock_pi;
>
> So this not only exits when the task has no waiters, it also exits
> unconditionally when the current waiter is not the top priority waiter
> of the task.
>
> So in a nested locking scenario, it might abort the lock chain walk
> and therefor miss a potential deadlock.
>
> Simple fix: Continue the chain walk, when deadlock detection is
> enabled.
>
> Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx
> ---
> kernel/locking/rtmutex.c | 27 +++++++++++++++++++++------
> 1 file changed, 21 insertions(+), 6 deletions(-)
>
> Index: tip/kernel/locking/rtmutex.c
> ===================================================================
> --- tip.orig/kernel/locking/rtmutex.c
> +++ tip/kernel/locking/rtmutex.c
> @@ -343,16 +343,22 @@ static int rt_mutex_adjust_prio_chain(st
> * top_waiter can be NULL, when we are in the deboosting
> * mode!
> */
> - if (top_waiter && (!task_has_pi_waiters(task) ||
> - top_waiter != task_top_pi_waiter(task)))
> - goto out_unlock_pi;
> + if (top_waiter) {
> + if (!task_has_pi_waiters(task))
> + goto out_unlock_pi;
> +
> + if (!detect_deadlock && top_waiter != task_top_pi_waiter(task))
> + goto out_unlock_pi;
> + }

The above seems obvious.

>
> /*
> * When deadlock detection is off then we check, if further
> * priority adjustment is necessary.
> */
> - if (!detect_deadlock && waiter->prio == task->prio)
> - goto out_unlock_pi;
> + if (waiter->prio == task->prio) {
> + if (!detect_deadlock)
> + goto out_unlock_pi;
> + }

This too.

Although! if you want to micro-optimize the detect_deadlock case
where !detect_deadlock is false. You might want to reverse the order.
That way we don't need to dereference the ->prio for both waiter and
task before seeing that we don't go to the out_unlock_pi.

if (!detect_deadlock) {
if (waiter->prio == task->prio)
goto out_unlock_pi;
}

Hmm, or you did it this way for your "don't requeue" patch? Looking at
that one, it seems you did.

if (waiter->prio == task->prio) {
if (!detect_deadlock)
goto out_unlock_pi;
requeue = false;
}

Oh well. But for stable maybe have the optimized way? And change it
back when you add the requeue patch?

>
> lock = waiter->lock;
> if (!raw_spin_trylock(&lock->wait_lock)) {
> @@ -361,7 +367,12 @@ static int rt_mutex_adjust_prio_chain(st
> goto retry;
> }
>
> - /* Deadlock detection */
> + /*
> + * Deadlock detection. If the lock is the same as the original
> + * lock which caused us to walk the lock chain or if the
> + * current lock is owned by the task which initiated the chain
> + * walk, we detected a deadlock.
> + */
> if (lock == orig_lock || rt_mutex_owner(lock) == top_task) {
> debug_rt_mutex_deadlock(deadlock_detect, orig_waiter, lock);
> raw_spin_unlock(&lock->wait_lock);
> @@ -527,6 +538,10 @@ static int task_blocks_on_rt_mutex(struc
> unsigned long flags;
> int chain_walk = 0, res;
>
> + /* Early deadlock detection */
> + if (detect_deadlock && owner == task)
> + return -EDEADLK;
> +

This is an optimization, right? Does it belong for stable?

-- Steve

> raw_spin_lock_irqsave(&task->pi_lock, flags);
> __rt_mutex_adjust_prio(task);
> waiter->task = task;
>

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/