locking/rwsem: RT throttling issue due to RT task hogging the cpu

From: Mukesh Ojha
Date: Tue Sep 20 2022 - 12:21:35 EST


Hi,

We are observing an issue where sem->owner is not set and sem->count=6 [1], which means both the RWSEM_FLAG_WAITERS and RWSEM_FLAG_HANDOFF bits are set. If we unfold sem->wait_list, we see the waiters in the following order [2]: [a] is waiting for write, [b] and [c] are waiting for read, and [d] is the RT task for which waiter.handoff_set=true. [d] is continuously running on cpu7 and is not letting the first write waiter [a] run on cpu7.

[1]

sem = 0xFFFFFFD57DDC6680 -> (
count = (counter = 6),
owner = (counter = 0),
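
For reference, the decoding of count = 6 can be sketched in plain C. This is a userspace illustration only: the bit positions mirror the RWSEM_* definitions in kernel/locking/rwsem.c, while struct rwsem_count_bits and decode_rwsem_count() are hypothetical helper names that do not exist in the kernel.

```c
#include <stdio.h>

/* Bit layout as defined in kernel/locking/rwsem.c. */
#define RWSEM_WRITER_LOCKED	(1UL << 0)
#define RWSEM_FLAG_WAITERS	(1UL << 1)
#define RWSEM_FLAG_HANDOFF	(1UL << 2)
#define RWSEM_READER_SHIFT	8

/* Hypothetical helper type, for illustration only. */
struct rwsem_count_bits {
	int writer_locked;
	int waiters;
	int handoff;
	unsigned long readers;
};

/* Hypothetical helper: split a raw sem->count value into its fields. */
static struct rwsem_count_bits decode_rwsem_count(unsigned long count)
{
	struct rwsem_count_bits b;

	b.writer_locked = !!(count & RWSEM_WRITER_LOCKED);
	b.waiters = !!(count & RWSEM_FLAG_WAITERS);
	b.handoff = !!(count & RWSEM_FLAG_HANDOFF);
	b.readers = count >> RWSEM_READER_SHIFT;
	return b;
}

/* Print the decoded fields of a raw count value. */
static void print_rwsem_count(unsigned long count)
{
	struct rwsem_count_bits b = decode_rwsem_count(count);

	printf("count=%lu writer_locked=%d waiters=%d handoff=%d readers=%lu\n",
	       count, b.writer_locked, b.waiters, b.handoff, b.readers);
}
```

Calling print_rwsem_count(6) reports waiters and handoff set with no writer and no readers, i.e. the lock is free but reserved for the handoff waiter.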

[2]

[a] kworker/7:0 pid: 32516 ==> [b] iptables-restor pid: 18625 ==> [c] HwBinder:1544_3 pid: 2024 ==> [d] RenderEngine pid: 2032 cpu: 7 prio:97 (RT task)


Some time back, Waiman suggested the patch below, which could help the RT
task leave the cpu.

https://lore.kernel.org/all/8c33f989-8870-08c6-db12-521de634b34e@xxxxxxxxxx/

--------------------------------->O----------------------------

From c6493edd7a5e4f597ea55ff0eb3f1d763b335dfc Mon Sep 17 00:00:00 2001
From: Waiman Long <longman@xxxxxxxxxx>
Date: Tue, 20 Sep 2022 20:50:45 +0530
Subject: [PATCH] locking/rwsem: Yield the cpu after doing handoff optimistic
 spinning

It is possible the new lock owner (writer) can be preempted before setting
the owner field and if the current (e.g. RT task) waiter is the task that
preempts the new lock owner, it will spin in the handoff loop for a long time.
Avoid wasting cpu time and delaying the release of the lock by yielding
the cpu if handoff optimistic spinning has been done multiple times with
NULL owner.

Signed-off-by: Waiman Long <longman@xxxxxxxxxx>
Signed-off-by: Mukesh Ojha <quic_mojha@xxxxxxxxxxx>
---
 kernel/locking/rwsem.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
index 65f0262..a875758 100644
--- a/kernel/locking/rwsem.c
+++ b/kernel/locking/rwsem.c
@@ -361,6 +361,8 @@ enum rwsem_wake_type {
  */
 #define MAX_READERS_WAKEUP	0x100
 
+#define MAX_HANDOFF_SPIN	10
+
 static inline void
 rwsem_add_waiter(struct rw_semaphore *sem, struct rwsem_waiter *waiter)
 {
@@ -1109,6 +1111,7 @@ rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
 {
 	struct rwsem_waiter waiter;
 	DEFINE_WAKE_Q(wake_q);
+	int handoff_spins = 0;
 
 	/* do optimistic spinning and steal lock if possible */
 	if (rwsem_can_spin_on_owner(sem) && rwsem_optimistic_spin(sem)) {
@@ -1167,6 +1170,14 @@ rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
 		 * has just released the lock, OWNER_NULL will be returned.
 		 * In this case, we attempt to acquire the lock again
 		 * without sleeping.
+		 *
+		 * It is possible the new lock owner (writer) can be preempted
+		 * before setting the owner field and if the current(e.g RT task)
+		 * waiter is the task that preempts the new lock owner, it will
+		 * spin in this loop for a long time. Avoid wasting cpu time
+		 * and delaying the release of the lock by yielding the cpu if
+		 * handoff optimistic spinning has been done multiple times with
+		 * NULL owner.
 		 */
 		if (waiter.handoff_set) {
 			enum owner_state owner_state;
@@ -1175,8 +1186,10 @@ rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
 			owner_state = rwsem_spin_on_owner(sem);
 			preempt_enable();
 
-			if (owner_state == OWNER_NULL)
+			if ((owner_state == OWNER_NULL) && (handoff_spins < MAX_HANDOFF_SPIN)) {
+				handoff_spins++;
 				goto trylock_again;
+			}
 		}
 
 		schedule();
-- 
2.7.4
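
To illustrate the intent of the retry cap, here is a minimal userspace model of the bounded retry, not the kernel code itself. It assumes the owner field stays NULL for a fixed number of rounds before the lock becomes takeable; struct handoff_sim and handoff_wait() are hypothetical names, and the model stops at the first point where the waiter would call schedule().

```c
#include <stdbool.h>

#define MAX_HANDOFF_SPIN	10

/* Hypothetical result type, for illustration only. */
struct handoff_sim {
	int retries;	/* immediate "goto trylock_again" retries taken */
	bool acquired;	/* the trylock eventually succeeded */
	bool scheduled;	/* the waiter gave up the cpu instead */
};

/*
 * Hypothetical model of the handoff waiter's loop.
 * rounds_until_free: how many retries the simulated trylock keeps failing
 *                    while the owner field still reads NULL.
 * max_spin:          retry cap (MAX_HANDOFF_SPIN in the patch).
 */
static struct handoff_sim handoff_wait(int rounds_until_free, int max_spin)
{
	struct handoff_sim r = { 0, false, false };
	int handoff_spins = 0;

	for (;;) {
		/* Simulated trylock: succeeds once enough rounds have passed. */
		if (r.retries >= rounds_until_free) {
			r.acquired = true;
			return r;
		}

		/* Owner still NULL: retry immediately, but only up to the cap. */
		if (handoff_spins < max_spin) {
			handoff_spins++;
			r.retries++;
			continue;
		}

		/* Cap reached: this is where the patched code falls to schedule(). */
		r.scheduled = true;
		return r;
	}
}
```

In this model, handoff_wait(3, MAX_HANDOFF_SPIN) acquires the lock after 3 retries, while handoff_wait(1000, MAX_HANDOFF_SPIN) stops after 10 retries and yields, which is the behaviour the patch aims for when the preempted writer cannot run.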


-Mukesh