Re: [PATCH] sched: fix rq lock recursion issue
From: Qais Yousef
Date: Thu Jun 30 2022 - 17:53:24 EST
Hi Satya
On 06/24/22 00:42, Satya Durga Srinivasu Prabhala wrote:
> The recursion below is observed in a rare scenario: __schedule()
> takes the rq lock while, at around the same time, the task's affinity
> is being changed. The bpf function tracing sched_switch calls
> migrate_enable(), which sees the affinity change (cpus_ptr != cpus_mask)
> and lands in __set_cpus_allowed_ptr(), which tries to acquire the rq
> lock and causes the recursion bug.
>
> Fix the issue by switching to preempt_enable/disable() for non-RT
> Kernels.
Interesting bug, thanks for the report. Unfortunately I can't see this being
a fix: it only limits the bug's visibility to PREEMPT_RT kernels without fixing
the underlying problem, no? i.e. kernels compiled with PREEMPT_RT will still
hit this failure.
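To make the concern concrete, here is a rough single-threaded user-space sketch of the call path in the trace, not kernel code. All model_* names are hypothetical and exist only for illustration. It assumes the sched_switch hook runs with the rq lock already held and that migrate_enable() takes the rq lock only when it sees an affinity change, as the report describes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct model_rq {
	int owner;      /* id of the context holding the lock, -1 if free */
	int recursions; /* times the holder tried to take the lock again */
};

/* Non-recursive lock: a second acquire by the owner is counted as a bug. */
static bool model_rq_lock(struct model_rq *rq, int ctx)
{
	if (rq->owner == ctx) {
		rq->recursions++; /* the real kernel reports spin_bug() here */
		return false;
	}
	assert(rq->owner == -1); /* single-threaded model: no contention */
	rq->owner = ctx;
	return true;
}

static void model_rq_unlock(struct model_rq *rq, int ctx)
{
	assert(rq->owner == ctx);
	rq->owner = -1;
}

/*
 * Models migrate_enable(). rt_variant selects the body the patch keeps
 * only under CONFIG_PREEMPT_RT; otherwise it stands in for the bare
 * preempt_enable() path, which never touches the rq lock.
 */
static void model_migrate_enable(struct model_rq *rq, int ctx,
				 bool rt_variant, bool affinity_changed)
{
	if (!rt_variant)
		return;
	if (affinity_changed) {
		/* stands in for __set_cpus_allowed_ptr() -> task_rq_lock() */
		if (model_rq_lock(rq, ctx))
			model_rq_unlock(rq, ctx);
	}
}

/*
 * Models __schedule(): take the rq lock, then run the sched_switch hook
 * (the BPF program), which ends in migrate_enable().
 * Returns the number of lock recursions detected.
 */
int model_schedule(bool rt_variant, bool affinity_changed)
{
	struct model_rq rq = { .owner = -1, .recursions = 0 };
	const int ctx = 0;

	model_rq_lock(&rq, ctx);
	model_migrate_enable(&rq, ctx, rt_variant, affinity_changed);
	model_rq_unlock(&rq, ctx);
	return rq.recursions;
}
```

In this model, model_schedule(true, true) still reports one recursion, while model_schedule(false, true) reports none. So the #ifdef only hides the problem on non-RT builds.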
I'm curious how the race with the affinity change happens. I would have thought
user space would get blocked, since __schedule() holds the rq lock.
Do you have more details on that?
Thanks
--
Qais Yousef
>
> -010 |spin_bug(lock = ???, msg = ???)
> -011 |debug_spin_lock_before(inline)
> -011 |do_raw_spin_lock(lock = 0xFFFFFF89323BB600)
> -012 |_raw_spin_lock(inline)
> -012 |raw_spin_rq_lock_nested(inline)
> -012 |raw_spin_rq_lock(inline)
> -012 |task_rq_lock(p = 0xFFFFFF88CFF1DA00, rf = 0xFFFFFFC03707BBE8)
> -013 |__set_cpus_allowed_ptr(inline)
> -013 |migrate_enable()
> -014 |trace_call_bpf(call = ?, ctx = 0xFFFFFFFDEF954600)
> -015 |perf_trace_run_bpf_submit(inline)
> -015 |perf_trace_sched_switch(__data = 0xFFFFFFE82CF0BCB8, preempt = FALSE, prev = ?, next = ?)
> -016 |__traceiter_sched_switch(inline)
> -016 |trace_sched_switch(inline)
> -016 |__schedule(sched_mode = ?)
> -017 |schedule()
> -018 |arch_local_save_flags(inline)
> -018 |arch_irqs_disabled(inline)
> -018 |__raw_spin_lock_irq(inline)
> -018 |_raw_spin_lock_irq(inline)
> -018 |worker_thread(__worker = 0xFFFFFF88CE251300)
> -019 |kthread(_create = 0xFFFFFF88730A5A80)
> -020 |ret_from_fork(asm)
>
> Signed-off-by: Satya Durga Srinivasu Prabhala <quic_satyap@xxxxxxxxxxx>
> ---
> kernel/sched/core.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index bfa7452ca92e..e254e9227341 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -2223,6 +2223,7 @@ static void migrate_disable_switch(struct rq *rq, struct task_struct *p)
>
> void migrate_disable(void)
> {
> +#ifdef CONFIG_PREEMPT_RT
> struct task_struct *p = current;
>
> if (p->migration_disabled) {
> @@ -2234,11 +2235,15 @@ void migrate_disable(void)
> this_rq()->nr_pinned++;
> p->migration_disabled = 1;
> preempt_enable();
> +#else
> + preempt_disable();
> +#endif
> }
> EXPORT_SYMBOL_GPL(migrate_disable);
>
> void migrate_enable(void)
> {
> +#ifdef CONFIG_PREEMPT_RT
> struct task_struct *p = current;
>
> if (p->migration_disabled > 1) {
> @@ -2265,6 +2270,9 @@ void migrate_enable(void)
> p->migration_disabled = 0;
> this_rq()->nr_pinned--;
> preempt_enable();
> +#else
> + preempt_enable();
> +#endif
> }
> EXPORT_SYMBOL_GPL(migrate_enable);
>
> --
> 2.36.1
>