[PATCH v2] locking/rwsem: Disable preemption while trying for rwsem lock

From: Mukesh Ojha
Date: Thu Sep 08 2022 - 14:24:54 EST

From: Gokul krishna Krishnakumar <quic_gokukris@xxxxxxxxxxx>

Make the region inside the rwsem_write_trylock non preemptible.

We observe RT task is hogging CPU when trying to acquire rwsem lock
which was acquired by a kworker task but before the rwsem owner was set.

Here is the scenario:
1. CFS task (affined to a particular CPU) takes rwsem lock.

2. CFS task gets preempted by a RT task before setting owner.

3. RT task (FIFO) is trying to acquire the lock, but spinning until
RT throttling happens for the lock as the lock was taken by CFS task.

This patch attempts to fix the above issue by disabling preemption
until owner is set for the lock. While at it also fix the issues
at the places where rwsem_{set,clear}_owner() are called.

This also adds lockdep annotation of preemption disable in
rwsem_{set,clear}_owner() on Peter Z. suggestion.

Signed-off-by: Gokul krishna Krishnakumar <quic_gokukris@xxxxxxxxxxx>
Signed-off-by: Mukesh Ojha <quic_mojha@xxxxxxxxxxx>
Changes in v2:
- Remove preempt disable code in rwsem_try_write_lock_unqueued()
- Addressed suggestion from Peter Z.
- Modified commit text
kernel/locking/rwsem.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
index 65f0262..4487359 100644
--- a/kernel/locking/rwsem.c
+++ b/kernel/locking/rwsem.c
@@ -133,14 +133,19 @@
* the owner value concurrently without lock. Read from owner, however,
* may not need READ_ONCE() as long as the pointer value is only used
* for comparison and isn't being dereferenced.
+ *
+ * Both rwsem_{set,clear}_owner() functions should be in the same
+ * preempt disable section as the atomic op that changes sem->count.
static inline void rwsem_set_owner(struct rw_semaphore *sem)
+ lockdep_assert_preemption_disabled();
atomic_long_set(&sem->owner, (long)current);

static inline void rwsem_clear_owner(struct rw_semaphore *sem)
+ lockdep_assert_preemption_disabled();
atomic_long_set(&sem->owner, 0);

@@ -251,13 +256,16 @@ static inline bool rwsem_read_trylock(struct rw_semaphore *sem, long *cntp)
static inline bool rwsem_write_trylock(struct rw_semaphore *sem)
+ bool ret = false;

+ preempt_disable();
if (atomic_long_try_cmpxchg_acquire(&sem->count, &tmp, RWSEM_WRITER_LOCKED)) {
- return true;
+ ret = true;

- return false;
+ preempt_enable();
+ return ret;

@@ -1352,8 +1360,10 @@ static inline void __up_write(struct rw_semaphore *sem)
DEBUG_RWSEMS_WARN_ON((rwsem_owner(sem) != current) &&
!rwsem_test_oflags(sem, RWSEM_NONSPINNABLE), sem);

+ preempt_disable();
tmp = atomic_long_fetch_add_release(-RWSEM_WRITER_LOCKED, &sem->count);
+ preempt_enable();
if (unlikely(tmp & RWSEM_FLAG_WAITERS))