Re: [RFC][PATCH 3/3] locking/mutex: Add lock handoff to avoid starvation

From: Waiman Long
Date: Wed Aug 24 2016 - 15:50:24 EST


On 08/23/2016 04:32 PM, Peter Zijlstra wrote:
> On Tue, Aug 23, 2016 at 03:47:53PM -0400, Waiman Long wrote:
>> On 08/23/2016 08:46 AM, Peter Zijlstra wrote:
>>> @@ -573,8 +600,14 @@ __mutex_lock_common(struct mutex *lock,
>>>  		schedule_preempt_disabled();
>>>  		spin_lock_mutex(&lock->wait_lock, flags);
>>>
>>> +		if (__mutex_owner(lock) == current)
>>> +			break;
>>> +
>>>  		if (__mutex_trylock(lock))
>>>  			break;
>>> +
>>> +		if (__mutex_waiter_is_first(lock, &waiter))
>>> +			__mutex_set_flag(lock, MUTEX_FLAG_HANDOFF);
>>>  	}
>>>  	__set_task_state(task, TASK_RUNNING);
>>>
>> You may want to think about doing some spinning while the owner is active
>> instead of going back to sleep again here.
> For sure; I just didn't bother pulling in your patches. I didn't want to
> sink in more time in case people really hated on 1/3 ;-)

I think there is a race in how the handoff is being done.

  CPU 0                     CPU 1                       CPU 2
  -----                     -----                       -----
  __mutex_lock_common:                                  mutex_optimistic_spin:
    __mutex_trylock()
                            mutex_unlock:
                              if (owner &
                                  MUTEX_FLAG_HANDOFF)
                              owner &= 0x3;
                                                          __mutex_trylock();
                                                            owner = CPU2;
    __mutex_set_flag(lock,
        MUTEX_FLAG_HANDOFF)
                            __mutex_unlock_slowpath:
                              __mutex_handoff:
                                owner = CPU0;


Now both CPUs 1 and 2 think they have the lock. One way to fix that is
to check if the owner is still the original lock holder (CPU 0) before
doing the handoff, like:

--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -97,6 +97,8 @@ static void __mutex_handoff(struct mutex *lock, struct task_st
 	for (;;) {
 		unsigned long old, new;

+		if ((owner & ~MUTEX_FLAG_ALL) != current)
+			break;
 		new = (owner & MUTEX_FLAG_WAITERS);
 		new |= (unsigned long)task;
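
A minimal userspace sketch of the same idea, using C11 atomics instead of the kernel primitives. Everything here is hypothetical and for illustration only: the model_* helpers, the FLAG_* values, the HOLDER/WAITER/SPINNER ids standing in for task pointers, and replay_race(), which replays the interleaving sequentially (the unlocker releases, a spinner steals the lock, the first waiter sets the handoff flag, then the unlocker performs the handoff). It is not the patch itself.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Assumed layout: the low two bits are flags, the rest identifies the owner. */
#define FLAG_WAITERS 0x1UL
#define FLAG_HANDOFF 0x2UL
#define FLAG_ALL     (FLAG_WAITERS | FLAG_HANDOFF)

/* Hypothetical task ids standing in for task_struct pointers. */
enum { HOLDER = 0x10, WAITER = 0x20, SPINNER = 0x30 };

struct model_mutex { atomic_ulong owner; };

/* Take the lock only if nobody owns it; flag bits are preserved. */
static bool model_trylock(struct model_mutex *m, unsigned long self)
{
	unsigned long owner = atomic_load(&m->owner);

	for (;;) {
		if (owner & ~FLAG_ALL)
			return false;	/* somebody already owns it */
		if (atomic_compare_exchange_weak(&m->owner, &owner, self | owner))
			return true;
	}
}

/*
 * Hand the lock to 'next'. With 'guarded' set, bail out if 'self' is
 * no longer the owner, which is the check proposed in the diff above.
 */
static bool model_handoff(struct model_mutex *m, unsigned long self,
			  unsigned long next, bool guarded)
{
	unsigned long owner = atomic_load(&m->owner);

	for (;;) {
		unsigned long new;

		if (guarded && (owner & ~FLAG_ALL) != self)
			return false;
		new = (owner & FLAG_WAITERS) | next;
		if (atomic_compare_exchange_weak(&m->owner, &owner, new))
			return true;
	}
}

/* Replay the racy interleaving in order; return the final owner id. */
static unsigned long replay_race(bool guarded)
{
	struct model_mutex m;
	bool got;

	atomic_init(&m.owner, HOLDER | FLAG_WAITERS);

	got = model_trylock(&m, WAITER);	/* waiter: lock is held, fails */
	assert(!got);
	atomic_fetch_and(&m.owner, FLAG_ALL);	/* holder: owner &= 0x3 */
	got = model_trylock(&m, SPINNER);	/* spinner steals the lock */
	assert(got);
	(void)got;
	atomic_fetch_or(&m.owner, FLAG_HANDOFF);	/* waiter asks for handoff */
	model_handoff(&m, HOLDER, WAITER, guarded);	/* holder's slowpath */

	return atomic_load(&m.owner) & ~FLAG_ALL;
}
```

In this model, replay_race(false) ends with the waiter recorded as owner even though the spinner already took the lock, while replay_race(true) leaves the spinner's ownership intact because the handoff is refused.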

I also think that the MUTEX_FLAG_HANDOFF bit needs to be cleared if the list
is empty.

@@ -614,7 +633,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned
 	mutex_remove_waiter(lock, &waiter, task);
 	/* set it to 0 if there are no waiters left: */
 	if (likely(list_empty(&lock->wait_list)))
-		__mutex_clear_flag(lock, MUTEX_FLAG_WAITERS);
+		__mutex_clear_flag(lock, MUTEX_FLAG_WAITERS|MUTEX_FLAG_HANDOFF);

Or we should try to reset the handoff bit after the while loop exits, if the bit is still set.
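
The first option (clearing both bits once the wait list is empty) can be sketched in userspace with C11 atomics. The names model_clear_flag() and model_waiter_exit(), the FLAG_* values, and the wait_list_empty parameter are assumptions made for illustration; the real code would test lock->wait_list under wait_lock.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Assumed flag layout: low two bits of the owner word. */
#define FLAG_WAITERS 0x1UL
#define FLAG_HANDOFF 0x2UL

/* Atomically clear the given flag bits, leaving the owner bits alone. */
static void model_clear_flag(atomic_ulong *owner, unsigned long flag)
{
	atomic_fetch_and(owner, ~flag);
}

/*
 * Called by a waiter that has just acquired the lock and removed
 * itself from the wait list. If it was the last waiter, drop both
 * flags so that no stale handoff request is left behind.
 */
static unsigned long model_waiter_exit(atomic_ulong *owner, bool wait_list_empty)
{
	if (wait_list_empty)
		model_clear_flag(owner, FLAG_WAITERS | FLAG_HANDOFF);
	return atomic_load(owner);
}
```

With other waiters still queued, both bits are left untouched, so a pending handoff request from the next waiter is not lost.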

Cheers,
Longman