Re: [PATCH] locking/semaphore: Use raw_spin_trylock_irqsave() in down_trylock()

From: Waiman Long
Date: Tue Jan 21 2025 - 18:10:22 EST



On 1/21/25 3:08 AM, Peter Zijlstra wrote:
> On Mon, Jan 20, 2025 at 02:36:08PM -0500, Waiman Long wrote:
>> A circular lock dependency splat has been seen with down_trylock().
>>
>> [ 4011.795602] ======================================================
>> [ 4011.795603] WARNING: possible circular locking dependency detected
>> [ 4011.795607] 6.12.0-41.el10.s390x+debug
>> [ 4011.795612] ------------------------------------------------------
>> [ 4011.795613] dd/32479 is trying to acquire lock:
>> [ 4011.795617] 0015a20accd0d4f8 ((console_sem).lock){-.-.}-{2:2}, at: down_trylock+0x26/0x90
>> [ 4011.795636]
>> [ 4011.795636] but task is already holding lock:
>> [ 4011.795637] 000000017e461698 (&zone->lock){-.-.}-{2:2}, at: rmqueue_bulk+0xac/0x8f0
>> [ 4011.795644]
>> [ 4011.795644] which lock already depends on the new lock.
>> :
>> [ 4011.796025] (console_sem).lock --> hrtimer_bases.lock --> &zone->lock
>> [ 4011.796025]
>> [ 4011.796029]  Possible unsafe locking scenario:
>> [ 4011.796029]
>> [ 4011.796030]        CPU0                    CPU1
>> [ 4011.796031]        ----                    ----
>> [ 4011.796032]   lock(&zone->lock);
>> [ 4011.796034]                                lock(hrtimer_bases.lock);
>> [ 4011.796036]                                lock(&zone->lock);
>> [ 4011.796038]   lock((console_sem).lock);
>> [ 4011.796039]
>> [ 4011.796039]  *** DEADLOCK ***
> Urgh, I hate this ^ bit of the lockdep output, it pretends to be
> something useful, while it is the least useful part. Doubly so for
> anything with more than 2 locks involved.

Sorry for not including other relevant information.

               the existing dependency chain (in reverse order) is:
               -> #4 (&zone->lock){-.-.}-{2:2}:
               -> #3 (hrtimer_bases.lock){-.-.}-{2:2}:
               -> #2 (&rq->__lock){-.-.}-{2:2}:
               -> #1 (&p->pi_lock){-.-.}-{2:2}:
               -> #0 ((console_sem).lock){-.-.}-{2:2}:
The last one (#0) is actually due to calling try_to_wake_up() while
holding the console_sem raw_spinlock. Another way to break the circular
locking dependency is to use a wake_q in the semaphore code. I had
proposed that in the past, but you said that the problem might be gone
with the console rewrite. Now that the console rewrite should have been
completed in the v6.12 kernel, the problem is still there.
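
Roughly, the wake_q approach that I have in mind is something like the
sketch below (untested, just to show the idea of moving the wakeup out
of the sem->lock critical section):

#include <linux/sched/wake_q.h>

void __sched up(struct semaphore *sem)
{
	unsigned long flags;
	DEFINE_WAKE_Q(wake_q);

	raw_spin_lock_irqsave(&sem->lock, flags);
	if (likely(list_empty(&sem->wait_list))) {
		sem->count++;
	} else {
		struct semaphore_waiter *waiter =
			list_first_entry(&sem->wait_list,
					 struct semaphore_waiter, list);

		list_del(&waiter->list);
		waiter->up = true;
		/* Defer the wakeup instead of waking the task here */
		wake_q_add(&wake_q, waiter->task);
	}
	raw_spin_unlock_irqrestore(&sem->lock, flags);

	/* try_to_wake_up() now runs without sem->lock held */
	wake_up_q(&wake_q);
}

That would remove the (console_sem).lock --> &p->pi_lock dependency
without changing the semantics of any of the semaphore APIs.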

>> The calling sequence was:
>>
>> rmqueue_pcplist()
>> => __rmqueue_pcplist()
>> => rmqueue_bulk()
>> => expand()
>> => __add_to_free_list()
>> => __warn_printk()
>> => ...
>> => console_trylock()
>> => __down_trylock_console_sem()
>> => down_trylock()
>> => _raw_spin_lock_irqsave()
>>
>> Normally, a trylock call should avoid this kind of circular lock
>> dependency splat, but down_trylock() is an exception. Fix this problem
>> by using raw_spin_trylock_irqsave() in down_trylock() to make it behave
>> like the other trylock calls.
>>
>> Signed-off-by: Waiman Long <longman@xxxxxxxxxx>
>> ---
>>  kernel/locking/semaphore.c | 4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/kernel/locking/semaphore.c b/kernel/locking/semaphore.c
>> index 34bfae72f295..cb27cbf5162f 100644
>> --- a/kernel/locking/semaphore.c
>> +++ b/kernel/locking/semaphore.c
>> @@ -127,6 +127,7 @@ EXPORT_SYMBOL(down_killable);
>>   *
>>   * NOTE: This return value is inverted from both spin_trylock and
>>   * mutex_trylock! Be careful about this when converting code.
>> + * I.e. 0 on success, 1 on failure.
>>   *
>>   * Unlike mutex_trylock, this function can be used from interrupt context,
>>   * and the semaphore can be released by any task or interrupt.
>> @@ -136,7 +137,8 @@ int __sched down_trylock(struct semaphore *sem)
>>  	unsigned long flags;
>>  	int count;
>>  
>> -	raw_spin_lock_irqsave(&sem->lock, flags);
>> +	if (!raw_spin_trylock_irqsave(&sem->lock, flags))
>> +		return 1;
>>  	count = sem->count - 1;
>>  	if (likely(count >= 0))
>>  		sem->count = count;
> Urgh, this is terrible *again*. Didn't you try and do something
> similarly daft with the rt_mutex_trylock? And you didn't learn from
> that?

I also have a bit of concern, as it is a change in behavior that may
have unintended consequences. With the patch, a down_trylock() call can
now fail even when sem->count > 0, simply because another CPU happens
to be holding sem->lock at that moment. A quick sketch of an affected
caller (do_work() is just a placeholder):
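
#include <linux/semaphore.h>

extern void do_work(void);	/* placeholder for the caller's real work */

static bool try_do_work(struct semaphore *sem)
{
	/* Note the inverted return value: 0 on success, 1 on failure */
	if (down_trylock(sem))
		return false;	/* before: count == 0; now: also sem->lock contention */

	do_work();
	up(sem);
	return true;
}

I think using a wake_q in the semaphore code will be a less risky
choice. What do you think?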

Cheers,
Longman