Re: [PATCH 1/4] locking/ww_mutex: Fix a deadlock affecting ww_mutexes
From: Daniel Vetter
Date: Wed Nov 23 2016 - 07:51:00 EST
On Wed, Nov 23, 2016 at 12:25:22PM +0100, Nicolai Hähnle wrote:
> From: Nicolai Hähnle <Nicolai.Haehnle@xxxxxxx>
>
> Fix a race condition involving 4 threads and 2 ww_mutexes as indicated in
> the following example. Acquire context stamps are ordered like the thread
> numbers, i.e. thread #1 should back off when it encounters a mutex locked
> by thread #0 etc.
>
>     Thread #0    Thread #1    Thread #2    Thread #3
>     ---------    ---------    ---------    ---------
>                                            lock(ww)
>                                            success
>                  lock(ww')
>                  success
>                               lock(ww)
>                  lock(ww)        .
>                     .            .         unlock(ww) part 1
>     lock(ww)        .            .            .
>     success         .            .            .
>                     .            .         unlock(ww) part 2
>                     .         back off
>     lock(ww')       .
>        .            .
>     (stuck)      (stuck)
>
> Here, unlock(ww) part 1 is the part that sets lock->base.count to 1
> (without being protected by lock->base.wait_lock), meaning that thread #0
> can acquire ww in the fast path or, much more likely, the medium path
> in mutex_optimistic_spin. Since lock->base.count == 0, thread #0 then
> won't wake up any of the waiters in ww_mutex_set_context_fastpath.
>
> Then, unlock(ww) part 2 wakes up _only_the_first_ waiter of ww. This is
> thread #2, since waiters are added at the tail. Thread #2 wakes up and
> backs off since it sees ww owned by a context with a lower stamp.
>
> Meanwhile, thread #1 is never woken up, and so it won't back off its lock
> on ww'. So thread #0 gets stuck waiting for ww' to be released.
>
> This patch fixes the deadlock by waking up all waiters in the slow path
> of ww_mutex_unlock.
>
> We have an internal test case for amdgpu which continuously submits
> command streams from tens of threads, where all command streams reference
> hundreds of GPU buffer objects with a lot of overlap in the buffer lists
> between command streams. This test reliably caused a deadlock, and while I
> haven't completely confirmed that it is exactly the scenario outlined
> above, this patch does fix the test case.
>
> v2:
> - use wake_q_add
> - add additional explanations
>
> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> Cc: Maarten Lankhorst <maarten.lankhorst@xxxxxxxxxxxxx>
> Cc: dri-devel@xxxxxxxxxxxxxxxxxxxxx
> Cc: stable@xxxxxxxxxxxxxxx
> Reviewed-by: Christian König <christian.koenig@xxxxxxx> (v1)
> Signed-off-by: Nicolai Hähnle <nicolai.haehnle@xxxxxxx>

Yeah, when the owning ctx changes we need to wake up all waiters to make
sure we catch all (new) deadlock scenarios. I tried poking at your example,
and I think it's solid and can't be minimized any further. I don't know the
mutex.c code itself well, but the changes seem reasonable. With that
caveat:

Reviewed-by: Daniel Vetter <daniel.vetter@xxxxxxxx>

Cheers, Daniel
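
For readers who want to poke at the example themselves, here is a minimal
userspace sketch in C that replays the wake-up step of the scenario in the
commit message. It is a single-threaded toy model, not kernel code: the names
toy_ww, must_back_off, replay_unlock and thread1_backs_off are hypothetical
and exist only for this illustration, and the "acquire during unlock" window
is modelled as two plain assignments rather than real atomics. It assumes, as
the commit message states, that a lower stamp wins and that waiters queue at
the tail.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_STAMP 4

/* Toy model of one ww_mutex: the owner's stamp plus a FIFO wait list. */
struct toy_ww {
	int owner;               /* stamp of the owning context, -1 if free */
	int waiters[MAX_STAMP];  /* stamps of sleeping waiters, head first */
	int nr_waiters;
};

/* A woken waiter must back off if the owner has a lower (older) stamp. */
static bool must_back_off(int waiter, int owner)
{
	return owner >= 0 && owner < waiter;
}

/*
 * Replay the unlock from the example: the lock becomes free (part 1),
 * `stealer` grabs it without waking anyone, then part 2 wakes either the
 * head waiter only or every waiter. backed_off[s] is set for each stamp s
 * that was woken and saw that it has to back off.
 */
static void replay_unlock(struct toy_ww *ww, int stealer, bool wake_all,
			  bool backed_off[MAX_STAMP])
{
	int to_wake, i;

	assert(ww->nr_waiters <= MAX_STAMP);
	for (i = 0; i < MAX_STAMP; i++)
		backed_off[i] = false;

	ww->owner = -1;       /* part 1: the lock looks free */
	ww->owner = stealer;  /* thread #0 acquires it in the window */

	/* part 2: decide how many waiters get woken, starting at the head */
	to_wake = wake_all ? ww->nr_waiters : (ww->nr_waiters > 0);
	for (i = 0; i < to_wake; i++) {
		int w = ww->waiters[i];

		assert(w >= 0 && w < MAX_STAMP);
		if (must_back_off(w, ww->owner))
			backed_off[w] = true;
	}
}

/* True if thread #1 (which also holds ww') learns that it must back off. */
static bool thread1_backs_off(bool wake_all)
{
	/* Thread #3 owns ww; thread #2 queued first, then thread #1. */
	struct toy_ww ww = { .owner = 3, .waiters = { 2, 1 }, .nr_waiters = 2 };
	bool backed_off[MAX_STAMP];

	replay_unlock(&ww, 0, wake_all, backed_off);
	return backed_off[1];
}
```

Calling thread1_backs_off(false) models the old behaviour, where only thread
#2 at the head of the list is woken and thread #1 keeps sleeping while
holding ww'. Calling thread1_backs_off(true) models the wake-all slow path
from the patch, where thread #1 is woken too and can release ww'.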
> ---
> kernel/locking/mutex.c | 33 +++++++++++++++++++++++++++++----
> 1 file changed, 29 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
> index a70b90d..7fbf9b4 100644
> --- a/kernel/locking/mutex.c
> +++ b/kernel/locking/mutex.c
> @@ -409,6 +409,9 @@ static bool mutex_optimistic_spin(struct mutex *lock,
>  __visible __used noinline
>  void __sched __mutex_unlock_slowpath(atomic_t *lock_count);
>
> +static __used noinline
> +void __sched __mutex_unlock_slowpath_wakeall(atomic_t *lock_count);
> +
>  /**
>   * mutex_unlock - release the mutex
>   * @lock: the mutex to be released
> @@ -473,7 +476,14 @@ void __sched ww_mutex_unlock(struct ww_mutex *lock)
>  	 */
>  	mutex_clear_owner(&lock->base);
>  #endif
> -	__mutex_fastpath_unlock(&lock->base.count, __mutex_unlock_slowpath);
> +	/*
> +	 * A previously _not_ waiting task may acquire the lock via the fast
> +	 * path during our unlock. In that case, already waiting tasks may have
> +	 * to back off to avoid a deadlock. Wake up all waiters so that they
> +	 * can check their acquire context stamp against the new owner.
> +	 */
> +	__mutex_fastpath_unlock(&lock->base.count,
> +				__mutex_unlock_slowpath_wakeall);
>  }
>  EXPORT_SYMBOL(ww_mutex_unlock);
>
> @@ -716,7 +726,7 @@ EXPORT_SYMBOL_GPL(__ww_mutex_lock_interruptible);
>   * Release the lock, slowpath:
>   */
>  static inline void
> -__mutex_unlock_common_slowpath(struct mutex *lock, int nested)
> +__mutex_unlock_common_slowpath(struct mutex *lock, int nested, int wake_all)
>  {
>  	unsigned long flags;
>  	WAKE_Q(wake_q);
> @@ -740,7 +750,14 @@ __mutex_unlock_common_slowpath(struct mutex *lock, int nested)
>  	mutex_release(&lock->dep_map, nested, _RET_IP_);
>  	debug_mutex_unlock(lock);
>
> -	if (!list_empty(&lock->wait_list)) {
> +	if (wake_all) {
> +		struct mutex_waiter *waiter;
> +
> +		list_for_each_entry(waiter, &lock->wait_list, list) {
> +			debug_mutex_wake_waiter(lock, waiter);
> +			wake_q_add(&wake_q, waiter->task);
> +		}
> +	} else if (!list_empty(&lock->wait_list)) {
>  		/* get the first entry from the wait-list: */
>  		struct mutex_waiter *waiter =
>  				list_entry(lock->wait_list.next,
> @@ -762,7 +779,15 @@ __mutex_unlock_slowpath(atomic_t *lock_count)
>  {
>  	struct mutex *lock = container_of(lock_count, struct mutex, count);
>
> -	__mutex_unlock_common_slowpath(lock, 1);
> +	__mutex_unlock_common_slowpath(lock, 1, 0);
> +}
> +
> +static void
> +__mutex_unlock_slowpath_wakeall(atomic_t *lock_count)
> +{
> +	struct mutex *lock = container_of(lock_count, struct mutex, count);
> +
> +	__mutex_unlock_common_slowpath(lock, 1, 1);
>  }
>
>  #ifndef CONFIG_DEBUG_LOCK_ALLOC
> --
> 2.7.4
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch