Re: [BUG]locking/rwsem: only clean RWSEM_FLAG_HANDOFF when already set
From: Waiman Long
Date: Sun Nov 07 2021 - 10:24:48 EST
On 11/7/21 04:01, Hillf Danton wrote:
On Sat, 6 Nov 2021 23:25:38 -0400 Waiman Long wrote:
On 11/6/21 08:39, 马振华 wrote:
Dear longman,
Recently, I found an issue where the rwsem count becomes a negative value. It
always happens when a task tries to get the lock with __down_write_killable()
and is then killed.
The issue happens like this:
CPU2                                        CPU4
task A [reader]                             task B [writer]

down_read_killable()  [locked]
  sem->count = 0x100
                                            down_write_killable()
                                              sem->count = 0x102  [wait list not empty]
up_read()
  sem->count = 0x2
                                            SIGKILL received
down_read_killable()
  sem->count = 0x102  [wait list not empty]
                                            goto out_nolock:
                                              list_del(&waiter.list);
                                              wait list is now empty
                                              sem->count -= RWSEM_FLAG_HANDOFF
                                              sem->count = 0xFE
                                              list_empty(&sem->wait_list) is TRUE
                                              sem->count &= ~RWSEM_FLAG_WAITERS
                                              sem->count = 0xFC
up_read()
  sem->count -= 0x100
  sem->count = 0xFFFFFFFFFFFFFFFC
  DEBUG_RWSEMS_WARN_ON(tmp < 0, sem);

So sem->count will be negative after the writer is killed.
I think that if the RWSEM_FLAG_HANDOFF flag is not set, we shouldn't clear it.
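For reference, the hex values in the trace follow from the rwsem count layout in
kernel/locking/rwsem.c at the time of the report (RWSEM_WRITER_LOCKED = 0x1,
RWSEM_FLAG_WAITERS = 0x2, RWSEM_FLAG_HANDOFF = 0x4, RWSEM_READER_BIAS = 0x100).
A minimal user-space replay of the count arithmetic, as an illustration only (not
kernel code):

#include <stdio.h>

#define RWSEM_WRITER_LOCKED     (1UL << 0)
#define RWSEM_FLAG_WAITERS      (1UL << 1)
#define RWSEM_FLAG_HANDOFF      (1UL << 2)
#define RWSEM_READER_BIAS       (1UL << 8)

int main(void)
{
        long count = 0;

        count += RWSEM_READER_BIAS;     /* A: down_read_killable        -> 0x100 */
        count |= RWSEM_FLAG_WAITERS;    /* B: queues as a writer        -> 0x102 */
        count -= RWSEM_READER_BIAS;     /* A: up_read                   -> 0x002 */
        count += RWSEM_READER_BIAS;     /* A: down_read_killable again  -> 0x102 */
        count -= RWSEM_FLAG_HANDOFF;    /* B: out_nolock, bit never set -> 0x0FE */
        count &= ~RWSEM_FLAG_WAITERS;   /* B: wait list now empty       -> 0x0FC */
        count -= RWSEM_READER_BIAS;     /* A: up_read                   -> -4    */

        printf("final count = %#lx (%ld)\n", (unsigned long)count, count);
        return 0;
}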
Thanks for reporting this possible race condition.
However, I am still trying to figure out how it is possible to set
wstate to WRITER_HANDOFF without actually setting the handoff bit as
well. The statement sequence should be as follows:
        wstate = WRITER_HANDOFF;
        raw_spin_lock_irq(&sem->wait_lock);
        if (rwsem_try_write_lock(sem, wstate))
                break;
        raw_spin_unlock_irq(&sem->wait_lock);
            :
        if (signal_pending_state(state, current))
                goto out_nolock;
The rwsem_try_write_lock() function will make sure that we either
acquire the lock and clear handoff or set the handoff bit. This should
be done before we actually check for the signal. I do think that it is
probably safer to use atomic_long_andnot() to clear the handoff bit
instead of using atomic_long_add().

Given that WRITER_HANDOFF does not guarantee that RWSEM_FLAG_HANDOFF is set
in rwsem_try_write_lock(), the flag should be cleared only by its setter to
avoid a count underflow.

Hillf
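A plain user-space illustration of the atomic_long_andnot() vs atomic_long_add()
point above (ordinary longs instead of atomic_long_t; the values mirror the trace):

#include <stdio.h>

#define RWSEM_FLAG_HANDOFF      (1UL << 2)

int main(void)
{
        long count = 0x102;     /* one reader + RWSEM_FLAG_WAITERS, handoff bit NOT set */

        /* unconditional subtraction: it bleeds into the reader count */
        printf("add -HANDOFF: %#lx\n", (unsigned long)(count - RWSEM_FLAG_HANDOFF));    /* 0xfe  */

        /* andnot: a harmless no-op when the bit is already clear */
        printf("andnot:       %#lx\n", (unsigned long)(count & ~RWSEM_FLAG_HANDOFF));   /* 0x102 */
        return 0;
}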
I did have a tentative patch to address this issue which is somewhat
similar to your approach. However, I would like to further investigate
the exact mechanics of the race condition to make sure that I won't miss
a latent bug somewhere else in the rwsem code.
-Longman
diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
index c51387a43265..20be967620c0 100644
--- a/kernel/locking/rwsem.c
+++ b/kernel/locking/rwsem.c
@@ -347,7 +347,8 @@ enum rwsem_wake_type {
 enum writer_wait_state {
         WRITER_NOT_FIRST,       /* Writer is not first in wait list */
         WRITER_FIRST,           /* Writer is first in wait list     */
-        WRITER_HANDOFF          /* Writer is first & handoff needed */
+        WRITER_NEED_HANDOFF,    /* Writer is first & handoff needed */
+        WRITER_HANDOFF          /* Writer is first & handoff set    */
 };
 
 /*
@@ -532,11 +533,11 @@ static void rwsem_mark_wake(struct rw_semaphore *sem,
  * race conditions between checking the rwsem wait list and setting the
  * sem->count accordingly.
  *
- * If wstate is WRITER_HANDOFF, it will make sure that either the handoff
+ * If wstate is WRITER_NEED_HANDOFF, it will make sure that either the handoff
  * bit is set or the lock is acquired with handoff bit cleared.
  */
 static inline bool rwsem_try_write_lock(struct rw_semaphore *sem,
-                                        enum writer_wait_state wstate)
+                                        enum writer_wait_state *wstate)
 {
         long count, new;
@@ -546,13 +547,13 @@ static inline bool rwsem_try_write_lock(struct rw_semaphore *sem,
         do {
                 bool has_handoff = !!(count & RWSEM_FLAG_HANDOFF);
 
-                if (has_handoff && wstate == WRITER_NOT_FIRST)
+                if (has_handoff && *wstate == WRITER_NOT_FIRST)
                         return false;
 
                 new = count;
 
                 if (count & RWSEM_LOCK_MASK) {
-                        if (has_handoff || (wstate != WRITER_HANDOFF))
+                        if (has_handoff || (*wstate != WRITER_NEED_HANDOFF))
                                 return false;
 
                         new |= RWSEM_FLAG_HANDOFF;
@@ -569,8 +570,10 @@ static inline bool rwsem_try_write_lock(struct rw_semaphore *sem,
          * We have either acquired the lock with handoff bit cleared or
          * set the handoff bit.
          */
-        if (new & RWSEM_FLAG_HANDOFF)
+        if (new & RWSEM_FLAG_HANDOFF) {
+                *wstate = WRITER_HANDOFF;
                 return false;
+        }
 
         rwsem_set_owner(sem);
         return true;
@@ -1083,7 +1086,7 @@ rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
         /* wait until we successfully acquire the lock */
         set_current_state(state);
         for (;;) {
-                if (rwsem_try_write_lock(sem, wstate)) {
+                if (rwsem_try_write_lock(sem, &wstate)) {
                         /* rwsem_try_write_lock() implies ACQUIRE on success */
                         break;
                 }
@@ -1138,7 +1141,7 @@ rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
                          */
                         if ((wstate == WRITER_FIRST) && (rt_task(current) ||
                             time_after(jiffies, waiter.timeout))) {
-                                wstate = WRITER_HANDOFF;
+                                wstate = WRITER_NEED_HANDOFF;
                                 lockevent_inc(rwsem_wlock_handoff);
                                 break;
                         }
@@ -1159,7 +1162,7 @@ rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
         list_del(&waiter.list);
 
         if (unlikely(wstate == WRITER_HANDOFF))
-                atomic_long_add(-RWSEM_FLAG_HANDOFF, &sem->count);
+                atomic_long_andnot(RWSEM_FLAG_HANDOFF, &sem->count);
 
         if (list_empty(&sem->wait_list))
                 atomic_long_andnot(RWSEM_FLAG_WAITERS, &sem->count);
--
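For clarity, the idea behind the tentative patch can be sketched in user space
(enum names taken from the diff above; this is only an illustration of the intended
state flow, not the kernel implementation): the waiter merely asks for a handoff
with WRITER_NEED_HANDOFF, rwsem_try_write_lock() upgrades it to WRITER_HANDOFF only
when it has really set RWSEM_FLAG_HANDOFF in sem->count, and the out_nolock path
clears the bit only in that case, with an andnot that cannot underflow the reader
count.

#include <stdbool.h>
#include <stdio.h>

#define RWSEM_WRITER_LOCKED     (1UL << 0)
#define RWSEM_FLAG_WAITERS      (1UL << 1)
#define RWSEM_FLAG_HANDOFF      (1UL << 2)
#define RWSEM_READER_BIAS       (1UL << 8)
#define RWSEM_READER_MASK       (~(RWSEM_READER_BIAS - 1))
#define RWSEM_LOCK_MASK         (RWSEM_WRITER_LOCKED | RWSEM_READER_MASK)

enum writer_wait_state {
        WRITER_NOT_FIRST,       /* Writer is not first in wait list */
        WRITER_FIRST,           /* Writer is first in wait list     */
        WRITER_NEED_HANDOFF,    /* Writer is first & handoff needed */
        WRITER_HANDOFF          /* Writer is first & handoff set    */
};

/* Simplified stand-in for rwsem_try_write_lock() after the patch. */
static bool try_write_lock(unsigned long *count, enum writer_wait_state *wstate)
{
        if (*count & RWSEM_LOCK_MASK) {
                if (*wstate != WRITER_NEED_HANDOFF)
                        return false;
                *count |= RWSEM_FLAG_HANDOFF;
                *wstate = WRITER_HANDOFF;       /* upgraded only when the bit is really set */
                return false;
        }
        /* lock is free: acquire it (a set handoff bit would be cleared here) */
        return true;
}

int main(void)
{
        unsigned long count = 0x102;    /* one reader + RWSEM_FLAG_WAITERS, as in the trace */
        enum writer_wait_state wstate = WRITER_NEED_HANDOFF;

        try_write_lock(&count, &wstate);

        /* out_nolock: only the confirmed setter may clear the bit */
        if (wstate == WRITER_HANDOFF)
                count &= ~RWSEM_FLAG_HANDOFF;

        printf("count = %#lx, wstate = %d\n", count, wstate);
        return 0;
}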