Re: [PATCH] locking/rwbase: Fix race condition in rwlock_rt fast path

From: Waiman Long
Date: Tue Sep 12 2023 - 10:22:49 EST


On 9/11/23 08:07, Haodong Wong wrote:
When the writer in fast path check onwer before sub READER_BIAS,since
var readers=READER_BIAS, reader thread can jump to critical section to
access the data that modified by writer which may cause itself panic.

For example we observed below panic in SMP PREEMPT RT:

Call trace:

try_to_wake_up+0x374/0x5d0
default_wake_function+0x3c/0x70
autoremove_wake_function+0x3c/0x70
__wake_up_common+0x90/0x150
__wake_up_common_lock+0x7c/0xc0
__wake_up+0x40/0x50
ep_poll_callback+0xa0/0x260
__wake_up_common+0x90/0x150
__wake_up_common_lock+0x7c/0xc0
__wake_up_sync_key+0x40/0x50
sock_def_readable+0x44/0x90
tcp_data_ready+0x3c/0xf0
tcp_rcv_established+0x754/0x7f0
tcp_v4_do_rcv+0x1f0/0x2e0
tcp_v4_rcv+0xb60/0xc80
ip_protocol_deliver_rcu+0x44/0x200
ip_local_deliver_finish+0x68/0x80
ip_local_deliver+0x80/0x130
ip_sublist_rcv_finish+0x48/0x70
ip_sublist_rcv+0x178/0x1f0
ip_list_rcv+0x108/0x1b0
__netif_receive_skb_list_core+0x18c/0x220
netif_receive_skb_list_internal+0x18c/0x2a0
napi_complete_done+0xc4/0x210
ether_napi_poll_rx+0xa4/0xe0
net_rx_action+0x120/0x430
__do_softirq+0x120/0x3b4
__local_bh_enable_ip+0xd8/0x140
irq_forced_thread_fn+0x84/0xc0
irq_thread+0x150/0x260
kthread+0x17c/0x1a0
ret_from_fork+0x10/0x18

The process is as follows:
Writer Thread Reader Thread

ep_poll: ep_poll_callback:
write_lock_irq: read_lock_irq:

set rtmutex onwer in fast patch
if reader < 0
exit read_lock
__remove_wait_queue

After subtracting READER_BIAS, rwbase_write_lock() will wait until the reader count is 0 before proceeding. So I don't see an obvious race here since the reader count has to be non-zero when that reader enters its critical section.


_wake_up PANIC happen

Signed-off-by: Haodong Wong <haydenw.kernel@xxxxxxxxx>
---
kernel/locking/rtmutex_common.h | 15 +++++++++++++++
kernel/locking/rwbase_rt.c | 9 +++++++++
2 files changed, 24 insertions(+)

diff --git a/kernel/locking/rtmutex_common.h b/kernel/locking/rtmutex_common.h
index 1162e07cdaea..d81042657ce2 100644
--- a/kernel/locking/rtmutex_common.h
+++ b/kernel/locking/rtmutex_common.h
@@ -161,6 +161,15 @@ static inline struct task_struct *rt_mutex_owner(struct rt_mutex_base *lock)
return (struct task_struct *) (owner & ~RT_MUTEX_HAS_WAITERS);
}
+static inline struct task_struct *rt_mutex_owner_acquire(struct rt_mutex_base *lock)
+{
+ /* Add rt mutex onwer acquire method */
+ unsigned long owner = (unsigned long) smp_load_acquire(&lock->owner);
+
+ return (struct task_struct *) (owner & ~RT_MUTEX_HAS_WAITERS);
+}
+
+
/*
* Constants for rt mutex functions which have a selectable deadlock
* detection.
@@ -229,6 +238,12 @@ static inline struct task_struct *rt_mutex_owner(struct rt_mutex_base *lock)
{
return NULL;
}
+
+static inline struct task_struct *rt_mutex_owner_acquire(struct rt_mutex_base *lock)
+{
+ return NULL;
+}
+
#endif /* !CONFIG_RT_MUTEXES */
#endif
diff --git a/kernel/locking/rwbase_rt.c b/kernel/locking/rwbase_rt.c
index 25ec0239477c..ca271f57b9c1 100644
--- a/kernel/locking/rwbase_rt.c
+++ b/kernel/locking/rwbase_rt.c
@@ -54,6 +54,15 @@ static __always_inline int rwbase_read_trylock(struct rwbase_rt *rwb)
{
int r;
+ /*
+ * Reader should check if writer in fast patch in case of they
+ * have race condition in the first time.
+ */
+
+ if ((atomic_read(&lock->readers) == READER_BIAS) &&
+ rt_mutex_owner_acquire(&lock->rtmutex))
+ return 0;
+

Does this patch even compile? There is no variable lock in rwbase_read_trylock().

Cheers,
Longman