Re: [PATCH] lockdep: Fix wait context check on softirq for PREEMPT_RT

From: Boqun Feng
Date: Tue Dec 03 2024 - 02:50:04 EST


On Mon, Dec 02, 2024 at 11:32:28AM +0100, Peter Zijlstra wrote:
> On Mon, Dec 02, 2024 at 10:20:17AM +0900, Ryo Takakura wrote:
> > Commit 0c1d7a2c2d32 ("lockdep: Remove softirq accounting on
> > PREEMPT_RT.") stopped updating @softirq_context on PREEMPT_RT
> > to ignore "inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage"
> > as the report accounts softirq context which PREEMPT_RT doesn't
> > have to.
> >
> > However, wait context check still needs to report mutex usage
> > within softirq, even when it's threaded on PREEMPT_RT. The check
> > is failing to report the usage as task_wait_context() checks if
> > it's in softirq by referencing @softirq_context, ending up not
> > assigning the correct wait type of LD_WAIT_CONFIG for PREEMPT_RT's
> > softirq.
> >
> > [    0.184549]   | wait context tests |
> > [    0.184549]   --------------------------------------------------------------------------
> > [    0.184549]                                  | rcu  | raw  | spin |mutex |
> > [    0.184549]   --------------------------------------------------------------------------
> > [    0.184550]                in hardirq context:  ok  |  ok  |  ok  |  ok  |
> > [    0.185083] in hardirq context (not threaded):  ok  |  ok  |  ok  |  ok  |
> > [    0.185606]                in softirq context:  ok  |  ok  |  ok  |FAILED|
> >
> > Account softirq context but only when !PREEMPT_RT so that
> > task_wait_context() returns LD_WAIT_CONFIG as intended.
> >
> > Signed-off-by: Ryo Takakura <ryotkkr98@xxxxxxxxx>
> >
> >
> > ---
> >
> > Hi!
> >
> > I wasn't able to come up with a way to fix the wait context test while
> > keeping the commit 0c1d7a2c2d32 ("lockdep: Remove softirq accounting
> > on PREEMPT_RT.") without referencing @softirq_context...
> > Hoping to get feedback on it!
> >
> > Also I wonder if the test can be skipped as I believe it's taken care

Skipping the test would be awful because tests are supposed to catch
unexpected bugs :/

> > by spinlock wait context test since the PREEMPT_RT's softirq context is
> > protected by local_lock which is mapped to rt_spinlock.
>
> Right,.. so I remember talking about this with Boqun, and I think we
> were going to 'fix' the test, but I can't quite remember.
>
> Perhaps adding the local_lock to SOFTIRQ_ENTER?

So I took a look: SOFTIRQ_ENTER() already calls local_bh_disable(),
which is supposed to acquire a local_lock "softirq_ctrl.lock" (Ryo, I
believe this is the local_lock you mentioned above?) in normal cases.
However, if local_bh_disable() is called with preemption disabled, then
no local_lock will be acquired. For example, if you do:

	preempt_disable();
	local_bh_disable();
	preempt_enable();
	mutex_lock();

no local_lock will be acquired, therefore check_wait_context() will
report nothing. The fun part of "why did this cause an issue in the
lockdep selftests?" is that these tests are run with
preempt_count() == 1 ;-) I guess this is because we run them at an
early stage of kernel boot? Will take a look tomorrow.

Maybe the right way to fix this is adding a conceptual local_lock for
BH disable like below.

Regards,
Boqun

------------------------->8
diff --git a/include/linux/bottom_half.h b/include/linux/bottom_half.h
index fc53e0ad56d9..d5b898588277 100644
--- a/include/linux/bottom_half.h
+++ b/include/linux/bottom_half.h
@@ -4,6 +4,7 @@
 
 #include <linux/instruction_pointer.h>
 #include <linux/preempt.h>
+#include <linux/lockdep.h>
 
 #if defined(CONFIG_PREEMPT_RT) || defined(CONFIG_TRACE_IRQFLAGS)
 extern void __local_bh_disable_ip(unsigned long ip, unsigned int cnt);
@@ -15,9 +16,12 @@ static __always_inline void __local_bh_disable_ip(unsigned long ip, unsigned int
 }
 #endif
 
+extern struct lockdep_map bh_lock_map;
+
 static inline void local_bh_disable(void)
 {
 	__local_bh_disable_ip(_THIS_IP_, SOFTIRQ_DISABLE_OFFSET);
+	lock_map_acquire(&bh_lock_map);
 }
 
 extern void _local_bh_enable(void);
@@ -25,6 +29,7 @@ extern void __local_bh_enable_ip(unsigned long ip, unsigned int cnt);
 
 static inline void local_bh_enable_ip(unsigned long ip)
 {
+	lock_map_release(&bh_lock_map);
 	__local_bh_enable_ip(ip, SOFTIRQ_DISABLE_OFFSET);
 }

diff --git a/kernel/softirq.c b/kernel/softirq.c
index 8b41bd13cc3d..17d9bf6e0caf 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -1066,3 +1066,13 @@ unsigned int __weak arch_dynirq_lower_bound(unsigned int from)
 {
 	return from;
 }
+
+static struct lock_class_key bh_lock_key;
+struct lockdep_map bh_lock_map = {
+	.name = "local_bh",
+	.key = &bh_lock_key,
+	.wait_type_outer = LD_WAIT_FREE,
+	.wait_type_inner = LD_WAIT_CONFIG, /* PREEMPT_RT makes BH preemptible. */
+	.lock_type = LD_LOCK_PERCPU,
+};
+EXPORT_SYMBOL_GPL(bh_lock_map);
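
For illustration only, here is a rough userspace model of why a held map
with wait_type_inner = LD_WAIT_CONFIG would make the mutex case visible
to the wait context check. It is a simplified sketch, not the kernel
code: every model_* and WAIT_* name is hypothetical, hardirq and
irq_context handling is left out, and WAIT_CONFIG is assumed to be
distinct from WAIT_SPIN. The only rule kept is that an acquisition is
flagged when the new lock's outer wait type exceeds the smallest inner
wait type currently in effect.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for lockdep's wait types, in the same order. */
enum wait_type {
	WAIT_INV = 0,	/* unset: not checked */
	WAIT_FREE,
	WAIT_SPIN,
	WAIT_CONFIG,	/* preemptible only on PREEMPT_RT */
	WAIT_SLEEP,
	WAIT_MAX,
};

struct model_lock {
	const char *name;
	enum wait_type outer;
	enum wait_type inner;
};

#define MODEL_MAX_HELD 8

struct model_task {
	bool softirq_context;	/* stays false in the PREEMPT_RT case discussed */
	const struct model_lock *held[MODEL_MAX_HELD];
	size_t depth;
};

/* Context-implied limit; hardirq handling is deliberately omitted. */
static enum wait_type model_task_wait_context(const struct model_task *t)
{
	return t->softirq_context ? WAIT_CONFIG : WAIT_MAX;
}

/* Record a lock (or a bare map) as held by the task. */
static void model_acquire(struct model_task *t, const struct model_lock *l)
{
	assert(t->depth < MODEL_MAX_HELD);
	t->held[t->depth++] = l;
}

/* Returns true when acquiring 'next' would be flagged as invalid. */
static bool model_check_wait_context(const struct model_task *t,
				     const struct model_lock *next)
{
	enum wait_type next_outer = next->outer ? next->outer : next->inner;
	enum wait_type curr_inner = model_task_wait_context(t);

	if (next->inner == WAIT_INV)
		return false;

	/* The tightest inner wait type among held locks wins. */
	for (size_t i = 0; i < t->depth; i++) {
		enum wait_type prev_inner = t->held[i]->inner;

		if (prev_inner != WAIT_INV && prev_inner < curr_inner)
			curr_inner = prev_inner;
	}

	return next_outer > curr_inner;
}

/* Model of the proposed map and of a sleeping lock such as a mutex. */
static const struct model_lock model_bh_map = { "local_bh", WAIT_FREE, WAIT_CONFIG };
static const struct model_lock model_mutex  = { "mutex", WAIT_INV, WAIT_SLEEP };
```

In this model, a task with no held locks and softirq_context unset gets
no report for model_mutex, which matches the FAILED selftest. Once
model_bh_map has been passed to model_acquire(), the same check flags
model_mutex, because WAIT_SLEEP exceeds WAIT_CONFIG.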