[PATCH v3 0/3] locking/mutex: Enable optimistic spinning of lock waiter

From: Waiman Long
Date: Tue Mar 22 2016 - 13:47:50 EST


v2->v3:
- Remove patch 4 as it is not useful.
- Allow need_resched() check for waiter & add more comments about
changes to address issues raised by PeterZ.

v1->v2:
- Set task state to running before doing optimistic spinning.
- Add 2 more patches to handle possible missed wakeups and wasteful
spinning in try_to_wake_up() function.

This patchset is a variant of PeterZ's "locking/mutex: Avoid spinner
vs waiter starvation" patch. The major difference is that the
waiter-spinner won't enter into the OSQ used by the spinners. Instead,
it will spin directly on the lock in parallel with the queue head
of the OSQ. So there will be a bit more cacheline contention on the
lock cacheline, but that shouldn't cause noticeable impact on system
performance.

This patchset tries to address 2 issues with Peter's patch:

1) Ding Tianhong still found that hung tasks could occur in some cases.
2) Jason Low found a performance regression in some AIM7 workloads.

Making the waiter-spinner spin directly on the mutex increases its
chance of getting the lock, since it no longer has to wait in the OSQ
for its turn.
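
For illustration only, here is a minimal userspace sketch of the two
spinning paths described above. It is not the kernel code: the
toy_mutex type, the toy_* helpers, the max_spins bound and the use of a
single atomic_flag as a stand-in for the OSQ are all assumptions made
for this example. A regular spinner that fails to take the stand-in
OSQ simply gives up here, whereas the real OSQ would queue it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Toy model, not kernel code. count is 1 when unlocked and 0 when
 * locked. osq is a hypothetical stand-in for the OSQ: whoever holds
 * the flag plays the role of the OSQ queue head.
 */
struct toy_mutex {
	atomic_int count;
	atomic_flag osq;
};

static void toy_mutex_init(struct toy_mutex *m)
{
	atomic_init(&m->count, 1);
	atomic_flag_clear(&m->osq);
}

/* Try to move count from 1 (unlocked) to 0 (locked). */
static bool toy_trylock(struct toy_mutex *m)
{
	int unlocked = 1;

	return atomic_compare_exchange_strong(&m->count, &unlocked, 0);
}

static void toy_unlock(struct toy_mutex *m)
{
	assert(atomic_load(&m->count) <= 0);	/* must be held */
	atomic_store(&m->count, 1);
}

/*
 * Spin for the lock for at most max_spins attempts.
 * waiter == false: regular spinner, must own the stand-in OSQ first.
 * waiter == true:  waiter-spinner, bypasses the OSQ and spins on the
 *                  lock directly, in parallel with the OSQ head.
 */
static bool toy_optimistic_spin(struct toy_mutex *m, bool waiter,
				int max_spins)
{
	bool got = false;

	if (!waiter && atomic_flag_test_and_set(&m->osq))
		return false;	/* simplification: no queueing */

	for (int i = 0; i < max_spins && !got; i++)
		got = toy_trylock(m);

	if (!waiter)
		atomic_flag_clear(&m->osq);
	return got;
}
```

In this model, a caller with waiter set can still take a free lock
while another task holds the OSQ stand-in, which is the behavior the
paragraph above is after.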

Patch 1 modifies the mutex_optimistic_spin() function to enable it
to be called by a waiter-spinner that doesn't need to go into the OSQ.

Patch 2 modifies the mutex locking slowpath to make the waiter call
mutex_optimistic_spin() to do spinning after being woken up.

Patch 3 reverses the sequence of setting the task state and changing
the mutex count to -1 to prevent the possibility of a missed wakeup.

Patch 4 from v2, which modified the wakeup code to abandon the wakeup
operation while spinning on the on_cpu flag if the task had changed
back to a non-sleeping state, has been dropped from this version.
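
As a rough illustration of why the ordering in patch 3 matters, the
sketch below replays the generic lost-wakeup pattern in a
single-threaded, deterministic way. It does not reproduce the exact
interleaving in mutex.c: the toy_sim type, the toy_* helpers and the
use of a plain "locked" flag in place of the mutex count are
assumptions made for this example. The only property relied upon is
that waking a task that is already running has no effect.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_state { TOY_RUNNING, TOY_SLEEPING };

/* Hypothetical model: one lock and the state of its single waiter. */
struct toy_sim {
	bool locked;
	enum toy_state state;
};

/* Unlocker: release the lock, then wake the waiter. Waking a task
 * that is already TOY_RUNNING changes nothing. */
static void toy_unlock_and_wake(struct toy_sim *s)
{
	assert(s->locked);
	s->locked = false;
	s->state = TOY_RUNNING;
}

/*
 * Replay one fixed interleaving in which the unlock races with the
 * waiter going to sleep. Returns true if the waiter ends up asleep
 * while the lock is free, i.e. the wakeup was lost.
 */
static bool toy_lost_wakeup(bool state_first)
{
	struct toy_sim s = { .locked = true, .state = TOY_RUNNING };
	bool saw_locked;

	if (state_first) {
		s.state = TOY_SLEEPING;		/* announce intent to sleep */
		saw_locked = s.locked;		/* then look at the lock */
		toy_unlock_and_wake(&s);	/* wakeup undoes SLEEPING */
	} else {
		saw_locked = s.locked;		/* look at the lock first */
		toy_unlock_and_wake(&s);	/* wakeup is a no-op here */
		s.state = TOY_SLEEPING;		/* too late: nobody will wake us */
	}

	/* The waiter only blocks if it saw the lock held and its state
	 * is still TOY_SLEEPING when it would call the scheduler. */
	return saw_locked && s.state == TOY_SLEEPING && !s.locked;
}
```

With state_first set, the racing wakeup resets the state to running and
the waiter does not block; with the opposite order, the wakeup lands
before the state change and is lost.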

My own test on a 4-socket E7-4820 v3 system showed a regression of
about 4% in the high_systime workload with Peter's patch which this
new patch effectively eliminates.

Testing on an 8-socket Westmere-EX server, however, showed a
performance change ranging from -9% to more than +140% on the fserver
workload of AIM7, depending on how the system was set up.

Waiman Long (3):
locking/mutex: Add waiter parameter to mutex_optimistic_spin()
locking/mutex: Enable optimistic spinning of woken task in wait queue
locking/mutex: Avoid missed wakeup of mutex waiter

kernel/locking/mutex.c | 126 ++++++++++++++++++++++++++++++++++--------------
1 files changed, 90 insertions(+), 36 deletions(-)