Re: [PATCH -v4 6/8] locking/mutex: Restructure wait loop

From: Waiman Long
Date: Mon Oct 17 2016 - 19:17:53 EST


On 10/07/2016 10:52 AM, Peter Zijlstra wrote:
Doesn't really matter yet, but pull the HANDOFF and trylock out from
under the wait_lock.

The intention is to add an optimistic spin loop here, which requires
we do not hold the wait_lock, so shuffle code around in preparation.

Also clarify the purpose of taking the wait_lock in the wait loop, its
tempting to want to avoid it altogether, but the cancellation cases
need to to avoid losing wakeups.

Suggested-by: Waiman Long<waiman.long@xxxxxxx>
Signed-off-by: Peter Zijlstra (Intel)<peterz@xxxxxxxxxxxxx>
---
kernel/locking/mutex.c | 30 +++++++++++++++++++++++++-----
1 file changed, 25 insertions(+), 5 deletions(-)

--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -631,13 +631,21 @@ __mutex_lock_common(struct mutex *lock,

lock_contended(&lock->dep_map, ip);

+ set_task_state(task, state);

Do we want to set the state here? I am not sure if it is OK to set the task state without ever calling schedule().

for (;;) {
+ /*
+ * Once we hold wait_lock, we're serialized against
+ * mutex_unlock() handing the lock off to us, do a trylock
+ * before testing the error conditions to make sure we pick up
+ * the handoff.
+ */
if (__mutex_trylock(lock, first))
- break;
+ goto acquired;

/*
- * got a signal? (This code gets eliminated in the
- * TASK_UNINTERRUPTIBLE case.)
+ * Check for signals and wound conditions while holding
+ * wait_lock. This ensures the lock cancellation is ordered
+ * against mutex_unlock() and wake-ups do not go missing.
*/
if (unlikely(signal_pending_state(state, task))) {
ret = -EINTR;
@@ -650,16 +658,27 @@ __mutex_lock_common(struct mutex *lock,
goto err;
}

- __set_task_state(task, state);
spin_unlock_mutex(&lock->wait_lock, flags);
schedule_preempt_disabled();
- spin_lock_mutex(&lock->wait_lock, flags);

if (!first&& __mutex_waiter_is_first(lock,&waiter)) {
first = true;
__mutex_set_flag(lock, MUTEX_FLAG_HANDOFF);
}
+
+ set_task_state(task, state);

I would suggest keep the __set_task_state() above and change set_task_state(task, state) to set_task_state(task, TASK_RUNNING) to provide the memory barrier. Then we don't need adding __set_task_state() calls below.

+ /*
+ * Here we order against unlock; we must either see it change
+ * state back to RUNNING and fall through the next schedule(),
+ * or we must see its unlock and acquire.
+ */
+ if (__mutex_trylock(lock, first))
+ break;
+

I don't think we need a trylock here since we are going to do it at the top of the loop within wait_lock anyway.

+ spin_lock_mutex(&lock->wait_lock, flags);
}
+ spin_lock_mutex(&lock->wait_lock, flags);
+acquired:
__set_task_state(task, TASK_RUNNING);

mutex_remove_waiter(lock,&waiter, task);
@@ -682,6 +701,7 @@ __mutex_lock_common(struct mutex *lock,
return 0;

err:
+ __set_task_state(task, TASK_RUNNING);
mutex_remove_waiter(lock,&waiter, task);
spin_unlock_mutex(&lock->wait_lock, flags);
debug_mutex_free_waiter(&waiter);



Cheers,
Longman