Re: [PATCH] lockdep: Fix wait context check on softirq for PREEMPT_RT

From: Ryo Takakura
Date: Mon Dec 09 2024 - 11:09:55 EST


Hi!

Sorry for the late reply. I was trying to understand some of the
selftest reports that I came across...

On Mon, 2 Dec 2024 23:49:24 -0800, Boqun Feng wrote:
>Maybe the right way to fix this is adding a conceptual local_lock for
>BH disable like below.
>
>Regards,
>Boqun
>
>------------------------->8
>diff --git a/include/linux/bottom_half.h b/include/linux/bottom_half.h
>index fc53e0ad56d9..d5b898588277 100644
>--- a/include/linux/bottom_half.h
>+++ b/include/linux/bottom_half.h
>@@ -4,6 +4,7 @@
>
> #include <linux/instruction_pointer.h>
> #include <linux/preempt.h>
>+#include <linux/lockdep.h>
>
> #if defined(CONFIG_PREEMPT_RT) || defined(CONFIG_TRACE_IRQFLAGS)
> extern void __local_bh_disable_ip(unsigned long ip, unsigned int cnt);
>@@ -15,9 +16,12 @@ static __always_inline void __local_bh_disable_ip(unsigned long ip, unsigned int
> }
> #endif
>
>+extern struct lockdep_map bh_lock_map;
>+
> static inline void local_bh_disable(void)
> {
> 	__local_bh_disable_ip(_THIS_IP_, SOFTIRQ_DISABLE_OFFSET);
>+	lock_map_acquire(&bh_lock_map);
> }
>
> extern void _local_bh_enable(void);
>@@ -25,6 +29,7 @@ extern void __local_bh_enable_ip(unsigned long ip, unsigned int cnt);
>
> static inline void local_bh_enable_ip(unsigned long ip)
> {
>+	lock_map_release(&bh_lock_map);
> 	__local_bh_enable_ip(ip, SOFTIRQ_DISABLE_OFFSET);
> }

Maybe &bh_lock_map should also be released in local_bh_enable()?

static inline void local_bh_enable(void)
{
+	lock_map_release(&bh_lock_map);
	__local_bh_enable_ip(_THIS_IP_, SOFTIRQ_DISABLE_OFFSET);
}
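
That is, local_bh_enable() calls __local_bh_enable_ip() directly rather
than going through local_bh_enable_ip(), so a release placed only in
local_bh_enable_ip() never pairs with the acquire done in
local_bh_disable(). A sketch (untested) with the release mirrored in
both inline enable paths:

static inline void local_bh_enable_ip(unsigned long ip)
{
	/* Pair with the acquire in local_bh_disable(). */
	lock_map_release(&bh_lock_map);
	__local_bh_enable_ip(ip, SOFTIRQ_DISABLE_OFFSET);
}

static inline void local_bh_enable(void)
{
	/* Same pairing for callers that enable BH through this path. */
	lock_map_release(&bh_lock_map);
	__local_bh_enable_ip(_THIS_IP_, SOFTIRQ_DISABLE_OFFSET);
}

(_local_bh_enable() in kernel/softirq.c would presumably need the same
treatment.)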

On !CONFIG_PREEMPT_RT, I noticed that the softirq related selftests
(e.g. lock-inversion) fail with a recursive locking error:

[ 0.741422] ============================================
[ 0.741447] WARNING: possible recursive locking detected
[ 0.741471] 6.12.0-rc1-v8+ #25 Not tainted
[ 0.741495] --------------------------------------------
[ 0.741519] swapper/0/0 is trying to acquire lock:
[ 0.741544] ffffffecd02e0160 (local_bh){+.+.}-{1:3}, at: irq_inversion_soft_spin_123+0x0/0x178
[ 0.741621]
but task is already holding lock:
[ 0.741648] ffffffecd02e0160 (local_bh){+.+.}-{1:3}, at: irq_inversion_soft_spin_123+0x0/0x178
[ 0.741721]
other info that might help us debug this:
[ 0.741750] Possible unsafe locking scenario:

[ 0.741776] CPU0
[ 0.741793] ----
[ 0.741810] lock(local_bh);
[ 0.741840] lock(local_bh);
[ 0.741868]
*** DEADLOCK ***

where the selftests use SOFTIRQ_ENTER()/EXIT() and
SOFTIRQ_DISABLE()/ENABLE(), each of which enables BH on the way out with
local_bh_enable(). Since local_bh_enable() does not release &bh_lock_map
with the patch as posted, the next local_bh_disable() is reported as a
recursive acquire.
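
To illustrate (SOFTIRQ_* macro bodies paraphrased from
lib/locking-selftest.c, so treat this as an approximation):

/* Paraphrased from lib/locking-selftest.c: */
#define SOFTIRQ_DISABLE()	local_bh_disable()
#define SOFTIRQ_ENABLE()	local_bh_enable()

SOFTIRQ_DISABLE();	/* lock_map_acquire(&bh_lock_map) */
SOFTIRQ_ENABLE();	/* local_bh_enable(): &bh_lock_map is not released */
SOFTIRQ_DISABLE();	/* second acquire -> recursive locking report */
SOFTIRQ_ENABLE();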

But I was a little confused: isn't recursively disabling BH allowed,
especially since PREEMPT_RT doesn't disable preemption? (I was also
wondering whether recursively disabling BH is something that can happen
on !PREEMPT_RT, where it does disable preemption...)
If so, wouldn't a report for such a case be a false positive?
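
If recursive BH disabling is indeed legitimate, one option that comes to
mind (just a sketch, untested) would be the read-recursive acquire
variant, which lockdep allows to nest:

static inline void local_bh_disable(void)
{
	__local_bh_disable_ip(_THIS_IP_, SOFTIRQ_DISABLE_OFFSET);
	/*
	 * lock_map_acquire_read() is recursive, so a nested
	 * local_bh_disable() would not be flagged as recursive
	 * locking, while the wait context of &bh_lock_map should,
	 * as far as I can tell, still be checked.
	 */
	lock_map_acquire_read(&bh_lock_map);
}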

>diff --git a/kernel/softirq.c b/kernel/softirq.c
>index 8b41bd13cc3d..17d9bf6e0caf 100644
>--- a/kernel/softirq.c
>+++ b/kernel/softirq.c
>@@ -1066,3 +1066,13 @@ unsigned int __weak arch_dynirq_lower_bound(unsigned int from)
> {
> 	return from;
> }
>+
>+static struct lock_class_key bh_lock_key;
>+struct lockdep_map bh_lock_map = {
>+	.name = "local_bh",
>+	.key = &bh_lock_key,
>+	.wait_type_outer = LD_WAIT_FREE,
>+	.wait_type_inner = LD_WAIT_CONFIG, /* PREEMPT_RT makes BH preemptible. */
>+	.lock_type = LD_LOCK_PERCPU,
>+};
>+EXPORT_SYMBOL_GPL(bh_lock_map);

Sincerely,
Ryo Takakura