[tip:locking/core] locking/rwsem: More optimal RT task handling of null owner

From: tip-bot for Waiman Long
Date: Mon Jun 17 2019 - 10:34:35 EST


Commit-ID: 990fa7384a3057a3298bcf493651c6e14416c47c
Gitweb: https://git.kernel.org/tip/990fa7384a3057a3298bcf493651c6e14416c47c
Author: Waiman Long <longman@xxxxxxxxxx>
AuthorDate: Mon, 20 May 2019 16:59:08 -0400
Committer: Ingo Molnar <mingo@xxxxxxxxxx>
CommitDate: Mon, 17 Jun 2019 12:28:01 +0200

locking/rwsem: More optimal RT task handling of null owner

An RT task can do optimistic spinning only if the lock holder is
actually running. If the state of the lock holder isn't known, there
is a possibility that high priority of the RT task may block forward
progress of the lock holder if it happens to reside on the same CPU.
This will lead to deadlock. So we have to make sure that an RT task
will not spin on a reader-owned rwsem.

When the owner is temporarily set to NULL, there are two cases
where we may want to continue spinning:

1) The lock owner is in the process of releasing the lock, sem->owner
is cleared but the lock has not been released yet.

2) The lock was free and owner cleared, but another task just comes
in and acquire the lock before we try to get it. The new owner may
be a spinnable writer.

So an RT task is now made to retry one more time to see if it can
acquire the lock or continue spinning on the new owning writer.

When testing on a 8-socket IvyBridge-EX system, the one additional retry
seems to improve locking performance of RT write locking threads under
heavy contentions. The table below shows the locking rates (in kops/s)
with various write locking threads before and after the patch.

Locking threads Pre-patch Post-patch
--------------- --------- -----------
4 2,753 2,608
8 2,529 2,520
16 1,727 1,918
32 1,263 1,956
64 889 1,343

Signed-off-by: Waiman Long <longman@xxxxxxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Cc: Borislav Petkov <bp@xxxxxxxxx>
Cc: Davidlohr Bueso <dave@xxxxxxxxxxxx>
Cc: H. Peter Anvin <hpa@xxxxxxxxx>
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx>
Cc: Will Deacon <will.deacon@xxxxxxx>
Cc: huang ying <huang.ying.caritas@xxxxxxxxx>
Link: https://lkml.kernel.org/r/20190520205918.22251-10-longman@xxxxxxxxxx
Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
---
kernel/locking/rwsem.c | 51 +++++++++++++++++++++++++++++++++++++++++++-------
1 file changed, 44 insertions(+), 7 deletions(-)

diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
index 5532304406f7..e1840b7c5310 100644
--- a/kernel/locking/rwsem.c
+++ b/kernel/locking/rwsem.c
@@ -566,6 +566,7 @@ static noinline enum owner_state rwsem_spin_on_owner(struct rw_semaphore *sem)
static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
{
bool taken = false;
+ int prev_owner_state = OWNER_NULL;

preempt_disable();

@@ -583,7 +584,12 @@ static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
* 2) readers own the lock as we can't determine if they are
* actively running or not.
*/
- while (rwsem_spin_on_owner(sem) & OWNER_SPINNABLE) {
+ for (;;) {
+ enum owner_state owner_state = rwsem_spin_on_owner(sem);
+
+ if (!(owner_state & OWNER_SPINNABLE))
+ break;
+
/*
* Try to acquire the lock
*/
@@ -593,13 +599,44 @@ static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
}

/*
- * When there's no owner, we might have preempted between the
- * owner acquiring the lock and setting the owner field. If
- * we're an RT task that will live-lock because we won't let
- * the owner complete.
+ * An RT task cannot do optimistic spinning if it cannot
+ * be sure the lock holder is running or live-lock may
+ * happen if the current task and the lock holder happen
+ * to run in the same CPU. However, aborting optimistic
+ * spinning while a NULL owner is detected may miss some
+ * opportunity where spinning can continue without causing
+ * problem.
+ *
+ * There are 2 possible cases where an RT task may be able
+ * to continue spinning.
+ *
+ * 1) The lock owner is in the process of releasing the
+ * lock, sem->owner is cleared but the lock has not
+ * been released yet.
+ * 2) The lock was free and owner cleared, but another
+ * task just comes in and acquire the lock before
+ * we try to get it. The new owner may be a spinnable
+ * writer.
+ *
+ * To take advantage of two scenarios listed agove, the RT
+ * task is made to retry one more time to see if it can
+ * acquire the lock or continue spinning on the new owning
+ * writer. Of course, if the time lag is long enough or the
+ * new owner is not a writer or spinnable, the RT task will
+ * quit spinning.
+ *
+ * If the owner is a writer, the need_resched() check is
+ * done inside rwsem_spin_on_owner(). If the owner is not
+ * a writer, need_resched() check needs to be done here.
*/
- if (!sem->owner && (need_resched() || rt_task(current)))
- break;
+ if (owner_state != OWNER_WRITER) {
+ if (need_resched())
+ break;
+ if (rt_task(current) &&
+ (prev_owner_state != OWNER_WRITER))
+ break;
+ }
+ prev_owner_state = owner_state;

/*
* The cpu_relax() call is a compiler barrier which forces