Re: [PATCH] sched: fix rq lock recursion issue

From: Satya Durga Srinivasu Prabhala
Date: Tue Jul 05 2022 - 23:50:23 EST



On 6/30/22 3:37 PM, Steven Rostedt wrote:
On Fri, Jun 24, 2022 at 12:42:40AM -0700, Satya Durga Srinivasu Prabhala wrote:
The recursion below is observed in a rare scenario: __schedule()
takes the rq lock while, at around the same time, the task's affinity
is being changed. The bpf function tracing sched_switch calls
migrate_enable(), which sees the affinity change (cpus_ptr != cpus_mask)
and lands in __set_cpus_allowed_ptr(), which tries to acquire the rq
lock again and triggers the recursion bug.
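
As a rough user-space illustration of the sequence described above (this is not kernel code; every model_* name is hypothetical and the lock is a plain owner-tracking flag, not a raw spinlock), the re-entry can be sketched like this:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the rq lock, with owner tracking so that
 * a second acquire by the same context can be reported instead of
 * spinning forever. */
struct model_lock {
    bool held;
    int owner;              /* id of the holder, -1 when free */
};

enum lock_result { LOCK_OK, LOCK_RECURSION, LOCK_BUSY };

static enum lock_result model_lock_acquire(struct model_lock *l, int self)
{
    if (l->held)
        return l->owner == self ? LOCK_RECURSION : LOCK_BUSY;
    l->held = true;
    l->owner = self;
    return LOCK_OK;
}

static void model_lock_release(struct model_lock *l)
{
    assert(l->held);
    l->held = false;
    l->owner = -1;
}

/* Models migrate_enable(): only when an affinity change is pending
 * does it go on to take the lock. */
static enum lock_result model_migrate_enable(struct model_lock *rq_lock,
                                             int self, bool affinity_changed)
{
    enum lock_result r;

    if (!affinity_changed)
        return LOCK_OK;     /* fast path, no lock taken */

    r = model_lock_acquire(rq_lock, self);
    if (r == LOCK_OK)
        model_lock_release(rq_lock);
    return r;
}

/* Models __schedule(): take the lock, run the tracing hook while the
 * lock is held, then release. Returns what the hook observed. */
static enum lock_result model_schedule(struct model_lock *rq_lock,
                                       int self, bool affinity_changed)
{
    enum lock_result r = model_lock_acquire(rq_lock, self);

    if (r != LOCK_OK)
        return r;
    r = model_migrate_enable(rq_lock, self, affinity_changed);
    model_lock_release(rq_lock);
    return r;
}
```

In this model the hook only reports recursion when the affinity change is pending, which matches the "rare scenario" wording: the common path never touches the lock a second time.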

Fix the issue by switching to preempt_disable()/preempt_enable() for
non-RT kernels.

-010 |spin_bug(lock = ???, msg = ???)
-011 |debug_spin_lock_before(inline)
-011 |do_raw_spin_lock(lock = 0xFFFFFF89323BB600)
-012 |_raw_spin_lock(inline)
-012 |raw_spin_rq_lock_nested(inline)
-012 |raw_spin_rq_lock(inline)
-012 |task_rq_lock(p = 0xFFFFFF88CFF1DA00, rf = 0xFFFFFFC03707BBE8)
-013 |__set_cpus_allowed_ptr(inline)
-013 |migrate_enable()
-014 |trace_call_bpf(call = ?, ctx = 0xFFFFFFFDEF954600)
-015 |perf_trace_run_bpf_submit(inline)
-015 |perf_trace_sched_switch(__data = 0xFFFFFFE82CF0BCB8, preempt = FALSE, prev = ?, next = ?)
-016 |__traceiter_sched_switch(inline)
-016 |trace_sched_switch(inline)
trace_sched_switch() disables preemption.

So how is this a fix?
Thanks for your time and comments.
I was mostly looking at non-RT kernels, where switching to preempt_disable()/preempt_enable()
helps because it is just an increment/decrement of a count. I agree that this isn't the right fix.
I'm still looking for an easy way to reproduce the issue. I will check further and get back to you.
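
To make the "increment/decrement of a count" remark concrete, here is a minimal user-space sketch of a nesting counter. It is an assumption-laden model, not the kernel's implementation: the model_* names are hypothetical, there is a single global counter instead of per-task state, and no barriers or rescheduling checks are modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the preemption counter. */
static int model_preempt_count;

static void model_preempt_disable(void)
{
    model_preempt_count++;
}

static void model_preempt_enable(void)
{
    assert(model_preempt_count > 0);    /* unbalanced enable is a bug */
    model_preempt_count--;
}

static bool model_preemptible(void)
{
    return model_preempt_count == 0;
}

/* Models a tracepoint that already runs its hook with preemption
 * disabled. Returns the nesting depth seen between the hook's
 * enter and exit calls. */
static int model_tracepoint(void (*hook_enter)(void),
                            void (*hook_exit)(void))
{
    int depth_inside;

    model_preempt_disable();
    hook_enter();
    depth_inside = model_preempt_count;
    hook_exit();
    model_preempt_enable();
    return depth_inside;
}
```

Passing model_preempt_disable and model_preempt_enable as the hook pair shows that, inside a section that already has preemption disabled, the extra pair only moves the depth from 1 to 2 and back.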


-- Steve

-016 |__schedule(sched_mode = ?)
-017 |schedule()
-018 |arch_local_save_flags(inline)
-018 |arch_irqs_disabled(inline)
-018 |__raw_spin_lock_irq(inline)
-018 |_raw_spin_lock_irq(inline)
-018 |worker_thread(__worker = 0xFFFFFF88CE251300)
-019 |kthread(_create = 0xFFFFFF88730A5A80)
-020 |ret_from_fork(asm)

Signed-off-by: Satya Durga Srinivasu Prabhala <quic_satyap@xxxxxxxxxxx>
---
kernel/sched/core.c | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index bfa7452ca92e..e254e9227341 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2223,6 +2223,7 @@ static void migrate_disable_switch(struct rq *rq, struct task_struct *p)
void migrate_disable(void)
{
+#ifdef CONFIG_PREEMPT_RT
struct task_struct *p = current;
if (p->migration_disabled) {
@@ -2234,11 +2235,15 @@ void migrate_disable(void)
this_rq()->nr_pinned++;
p->migration_disabled = 1;
preempt_enable();
+#else
+ preempt_disable();
+#endif
}
EXPORT_SYMBOL_GPL(migrate_disable);
void migrate_enable(void)
{
+#ifdef CONFIG_PREEMPT_RT
struct task_struct *p = current;
if (p->migration_disabled > 1) {
@@ -2265,6 +2270,9 @@ void migrate_enable(void)
p->migration_disabled = 0;
this_rq()->nr_pinned--;
preempt_enable();
+#else
+ preempt_enable();
+#endif
}
EXPORT_SYMBOL_GPL(migrate_enable);
--
2.36.1