Re: [PATCH] locking/rwsem: Prevent non-first waiter from spinning in down_write() slowpath

From: Waiman Long
Date: Mon Oct 10 2022 - 12:57:48 EST


On 10/10/22 06:24, Mukesh Ojha wrote:
Hi Waiman,


On 9/29/2022 11:36 PM, Waiman Long wrote:
On 9/29/22 14:04, Waiman Long wrote:
A non-first waiter can potentially spin in the for loop of
rwsem_down_write_slowpath() without sleeping but fail to acquire the
lock even if the rwsem is free if the following sequence happens:

   Non-first waiter       First waiter      Lock holder
   ----------------       ------------      -----------
   Acquire wait_lock
   rwsem_try_write_lock():
     Set handoff bit if RT or
       wait too long
     Set waiter->handoff_set
   Release wait_lock
                          Acquire wait_lock
                          Inherit waiter->handoff_set
                          Release wait_lock
                       Clear owner
                                            Release lock
   if (waiter.handoff_set) {
     rwsem_spin_on_owner(();
     if (OWNER_NULL)
       goto trylock_again;
   }
   trylock_again:
   Acquire wait_lock
   rwsem_try_write_lock():
      if (first->handoff_set && (waiter != first))
          return false;
   Release wait_lock

It is especially problematic if the non-first waiter is an RT task and
it is running on the same CPU as the first waiter as this can lead to
live lock.

Fixes: d257cc8cb8d5 ("locking/rwsem: Make handoff bit handling more consistent")
Signed-off-by: Waiman Long <longman@xxxxxxxxxx>
---
  kernel/locking/rwsem.c | 13 ++++++++++---
  1 file changed, 10 insertions(+), 3 deletions(-)

Mukesh, can you test if this patch can fix the RT task lockup problem?


Looks like, There is still a window for a race.

There is a chance when a reader who came first added it's BIAS and goes to slowpath and before it gets added to wait list it got preempted by RT task which  goes to slowpath as well and being the first waiter gets its hand-off bit set and not able to get the lock due to following condition in rwsem_try_write_lock()

 630                 if (count & RWSEM_LOCK_MASK) {  ==> reader has sets its bias
..
...

 634
 635                         new |= RWSEM_FLAG_HANDOFF;
 636                 } else {
 637                         new |= RWSEM_WRITER_LOCKED;


---------------------->----------------------->-------------------------

First reader (1)          writer(2) RT task             Lock holder(3)

It sets
RWSEM_READER_BIAS.
while it is going to
slowpath(as the lock
was held by (3)) and
before it got added
to the waiters list
it got preempted
by (2).
                        RT task also takes
                        the slowpath and add              release the
                        itself into waiting list          rwsem lock
            and since it is the first         clear the
                        it is the next one to get         owner.
                        the lock but it can not
                        get the lock as (count &
                        RWSEM_LOCK_MASK) is set
                        as (1) has added it but
                        not able to remove its
            adjustment.

Good catch!

It is indeed a possible livelock scenario. Will update the patch to address that.

Thanks,
Longman