[PATCH v3 09/12] locking/ww_mutex: Re-check ww->ctx in the inner optimistic spin loop

From: Nicolai Hähnle
Date: Wed Dec 21 2016 - 13:47:31 EST


From: Nicolai Hähnle <Nicolai.Haehnle@xxxxxxx>

In the following scenario, thread #1 should back off its attempt to lock
ww1 and unlock ww2 (assuming the acquire context stamps are ordered
accordingly).

Thread #0                       Thread #1
---------                       ---------
                                successfully lock ww2
set ww1->base.owner
                                attempt to lock ww1
                                confirm ww1->ctx == NULL
                                enter mutex_spin_on_owner
set ww1->ctx

What was likely to happen previously is:

attempt to lock ww2
refuse to spin because
  ww2->ctx != NULL
schedule()
                                detect thread #0 is off CPU
                                stop optimistic spin
                                return -EDEADLK
                                unlock ww2
                                wakeup thread #0
lock ww2

Now, we are more likely to see:

                                detect ww1->ctx != NULL
                                stop optimistic spin
                                return -EDEADLK
                                unlock ww2
successfully lock ww2

... because thread #1 will stop its optimistic spin as soon as possible.

The whole scenario is quite unlikely, since it requires thread #1's initial
ww1->ctx check to land in the narrow window between thread #0 setting the
owner and setting the ctx. But since we're busy-waiting in this loop anyway,
the additional check is basically free.
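
For reference, the -EDEADLK in the diagrams is returned to the caller of
ww_mutex_lock(), and it is the caller that performs the backoff that
thread #1 goes through (unlock, then sleep-wait on the contended lock).
A minimal caller-side sketch, using a hypothetical lock_pair() helper and
ww_class, following the pattern in Documentation/locking/ww-mutex-design.txt:

#include <linux/ww_mutex.h>

static DEFINE_WW_CLASS(example_ww_class);

/* Take two ww_mutexes in either order, backing off on -EDEADLK. */
static void lock_pair(struct ww_mutex *a, struct ww_mutex *b)
{
	struct ww_acquire_ctx ctx;

	ww_acquire_init(&ctx, &example_ww_class);

	/* First lock: ctx.acquired == 0, so this cannot hit -EDEADLK. */
	ww_mutex_lock(a, &ctx);

	while (ww_mutex_lock(b, &ctx) == -EDEADLK) {
		struct ww_mutex *tmp;

		/* We lost to an older context: drop what we hold, ... */
		ww_mutex_unlock(a);
		/* ... sleep until the contended lock is free, ... */
		ww_mutex_lock_slow(b, &ctx);
		/* ... and retry the other lock with the roles swapped. */
		tmp = a;
		a = b;
		b = tmp;
	}

	ww_acquire_done(&ctx);
}

This patch only affects how quickly thread #1 reaches the -EDEADLK return
in that dance; the backoff itself is unchanged.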

Found by inspection.

Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Maarten Lankhorst <dev@xxxxxxxxxxxxxx>
Cc: Daniel Vetter <daniel@xxxxxxxx>
Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Cc: dri-devel@xxxxxxxxxxxxxxxxxxxxx
Signed-off-by: Nicolai Hähnle <Nicolai.Haehnle@xxxxxxx>
---
 kernel/locking/mutex.c | 44 ++++++++++++++++++++++++++------------------
 1 file changed, 26 insertions(+), 18 deletions(-)

diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index c3f70dd..6f62695 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -373,7 +373,8 @@ ww_mutex_set_context_slowpath(struct ww_mutex *lock,
  * access and not reliable.
  */
 static noinline
-bool mutex_spin_on_owner(struct mutex *lock, struct task_struct *owner)
+bool mutex_spin_on_owner(struct mutex *lock, struct task_struct *owner,
+			 bool use_ww_ctx, struct ww_acquire_ctx *ww_ctx)
 {
 	bool ret = true;
 
@@ -396,6 +397,28 @@ bool mutex_spin_on_owner(struct mutex *lock, struct task_struct *owner)
 			break;
 		}
 
+		if (use_ww_ctx && ww_ctx && ww_ctx->acquired > 0) {
+			struct ww_mutex *ww;
+
+			ww = container_of(lock, struct ww_mutex, base);
+
+			/*
+			 * If ww->ctx is set the contents are undefined, only
+			 * by acquiring wait_lock there is a guarantee that
+			 * they are not invalid when reading.
+			 *
+			 * As such, when deadlock detection needs to be
+			 * performed the optimistic spinning cannot be done.
+			 *
+			 * Check this in every inner iteration because we may
+			 * be racing against another thread's ww_mutex_lock.
+			 */
+			if (READ_ONCE(ww->ctx)) {
+				ret = false;
+				break;
+			}
+		}
+
 		cpu_relax();
 	}
 	rcu_read_unlock();
@@ -483,22 +506,6 @@ static bool mutex_optimistic_spin(struct mutex *lock,
 	for (;;) {
 		struct task_struct *owner;
 
-		if (use_ww_ctx && ww_ctx && ww_ctx->acquired > 0) {
-			struct ww_mutex *ww;
-
-			ww = container_of(lock, struct ww_mutex, base);
-			/*
-			 * If ww->ctx is set the contents are undefined, only
-			 * by acquiring wait_lock there is a guarantee that
-			 * they are not invalid when reading.
-			 *
-			 * As such, when deadlock detection needs to be
-			 * performed the optimistic spinning cannot be done.
-			 */
-			if (READ_ONCE(ww->ctx))
-				goto fail_unlock;
-		}
-
 		/*
 		 * If there's an owner, wait for it to either
 		 * release the lock or go to sleep.
@@ -510,7 +517,8 @@ static bool mutex_optimistic_spin(struct mutex *lock,
 			break;
 		}
 
-		if (!mutex_spin_on_owner(lock, owner))
+		if (!mutex_spin_on_owner(lock, owner, use_ww_ctx,
+					 ww_ctx))
 			goto fail_unlock;
 	}
 
--
2.7.4