Re: [PATCH v3] locking/rwsem: reduce spinlock contention in wakeup after up_read/up_write

From: Waiman Long
Date: Mon Apr 27 2015 - 16:25:20 EST

On 04/24/2015 04:39 PM, Davidlohr Bueso wrote:
> On Fri, 2015-04-24 at 13:54 -0400, Waiman Long wrote:
>> This patch also checks one more time in __rwsem_do_wake() to see if
>> the rwsem was stolen just before doing the expensive wakeup operation
>> which will be unnecessary if the lock was stolen.
> It strikes me that this should be another patch, as the optimization is
> independent of the wake_lock (comments below).


I can split that out into a separate patch.

> Could you please reuse the CONFIG_RWSEM_SPIN_ON_OWNER ifdefiry we
> already have? Just add these where we define rwsem_spin_on_owner().

I can move rwsem_has_spinner() down there, but rwsem_has_active_writer() has to stay at the top; otherwise I would need to add a forward declaration for it.


>>    * handle the lock release when processes blocked on it that can now run
>>    * - if we come here from up_xxxx(), then:
>> @@ -125,6 +154,14 @@ __rwsem_do_wake(struct rw_semaphore *sem, enum rwsem_wake_type wake_type)
>>          struct list_head *next;
>>          long oldcount, woken, loop, adjustment;
>>
>> +        /*
>> +         * up_write() cleared the owner field before calling this function.
>> +         * If that field is now set, a writer must have stolen the lock and
>> +         * the wakeup operation should be aborted.
>> +         */
>> +        if (rwsem_has_active_writer(sem))
>> +                goto out;
> We currently allow small races between rwsem owner and counter checks.
> And __rwsem_do_wake() can be called by checking the former -- and lock
> stealing is done with the counter as well. Please see below how we back
> out of such cases, as it is very much considered when granting the next
> reader. So nack to this as is, sorry.

If the first waiter in the queue is a writer, wake_up_process() may be called directly, which can be quite expensive if the lock has already been stolen, because the woken task will just have to go back to sleep. Even for a reader, the counter check is done by means of an atomic instruction, which can cost hundreds of CPU cycles. The rwsem_has_active_writer() check, however, costs just a few additional cycles on top of the rwsem cacheline read that the next list_entry() call needs anyway. I consider this a low-cost way to save hundreds of wasted CPU cycles in the case where the lock has been stolen and the owner set.
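
To make the cost argument concrete, here is a rough userspace sketch of the idea using C11 atomics. It is not the kernel code: struct toy_rwsem, toy_has_active_writer() and toy_do_wake() are hypothetical stand-ins, the wakeup is simulated with a counter, and the owner check is only advisory (it can race with a writer that sets the owner right after the load).

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rw_semaphore; not the kernel layout. */
struct toy_rwsem {
    atomic_long count;       /* lock word, updated with atomic RMW ops */
    _Atomic(void *) owner;   /* non-NULL while a writer holds the lock */
    int wakeups;             /* counts simulated wake_up_process() calls */
};

/* Cheap check: a plain load from a cacheline the waker reads anyway. */
static bool toy_has_active_writer(struct toy_rwsem *sem)
{
    return atomic_load_explicit(&sem->owner, memory_order_relaxed) != NULL;
}

/* Returns true if a waiter was woken, false if the wakeup was aborted. */
static bool toy_do_wake(struct toy_rwsem *sem)
{
    /* Lock already stolen by a writer: skip the expensive path. */
    if (toy_has_active_writer(sem))
        return false;

    /* Expensive path: atomic RMW on the counter, then the wakeup itself. */
    atomic_fetch_add(&sem->count, 1);
    sem->wakeups++;
    return true;
}
```

With the owner field clear, toy_do_wake() takes the atomic path and performs the simulated wakeup; once a writer has set the owner, it returns early without touching the counter.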

