Re: [BUG] possible deadlock in __schedule (with reproducer available)
From: Steven Rostedt
Date: Sat Nov 23 2024 - 17:59:28 EST

On Sat, 23 Nov 2024 21:27:44 +0100
Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> On Sat, Nov 23, 2024 at 03:39:45AM +0000, Ruan Bonan wrote:
>
> > </TASK>
> > FAULT_INJECTION: forcing a failure.
> > name fail_usercopy, interval 1, probability 0, space 0, times 0
> > ======================================================
> > WARNING: possible circular locking dependency detected
> > 6.12.0-rc7-00144-g66418447d27b #8 Not tainted
> > ------------------------------------------------------
> > syz-executor144/330 is trying to acquire lock:
> > ffffffffbcd2da38 ((console_sem).lock){....}-{2:2}, at: down_trylock+0x20/0xa0 kernel/locking/semaphore.c:139
> >
> > but task is already holding lock:
> > ffff888065cbd718 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock_nested kernel/sched/core.c:598 [inline]
> > ffff888065cbd718 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock kernel/sched/sched.h:1506 [inline]
> > ffff888065cbd718 (&rq->__lock){-.-.}-{2:2}, at: rq_lock kernel/sched/sched.h:1805 [inline]
> > ffff888065cbd718 (&rq->__lock){-.-.}-{2:2}, at: __schedule+0x140/0x1e70 kernel/sched/core.c:6592
> >
> > which lock already depends on the new lock.
> >
> > _printk+0x7a/0xa0 kernel/printk/printk.c:2432
> > fail_dump lib/fault-inject.c:46 [inline]
> > should_fail_ex+0x3be/0x570 lib/fault-inject.c:154
> > strncpy_from_user+0x36/0x230 lib/strncpy_from_user.c:118
> > strncpy_from_user_nofault+0x71/0x140 mm/maccess.c:186
> > bpf_probe_read_user_str_common kernel/trace/bpf_trace.c:215 [inline]
> > ____bpf_probe_read_user_str kernel/trace/bpf_trace.c:224 [inline]
> > bpf_probe_read_user_str+0x2a/0x70 kernel/trace/bpf_trace.c:221
> > bpf_prog_bc7c5c6b9645592f+0x3e/0x40
> > bpf_dispatcher_nop_func include/linux/bpf.h:1265 [inline]
> > __bpf_prog_run include/linux/filter.h:701 [inline]
> > bpf_prog_run include/linux/filter.h:708 [inline]
> > __bpf_trace_run kernel/trace/bpf_trace.c:2316 [inline]
> > bpf_trace_run4+0x30b/0x4d0 kernel/trace/bpf_trace.c:2359
> > __bpf_trace_sched_switch+0x1c6/0x2c0 include/trace/events/sched.h:222
> > trace_sched_switch+0x12a/0x190 include/trace/events/sched.h:222
>
> -EWONTFIX. Don't do stupid.

Ack. BPF should not be causing deadlocks by running code that is called from
tracepoints. Tracepoints have a special context similar to NMIs. If you add
a hook into an NMI handler that causes a deadlock, it's a bug in the hook,
not the NMI code. If you add code that causes a deadlock when attaching to a
tracepoint, it's a bug in the hook, not the tracepoint.
-- Steve
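
For readers following the quoted lockdep report, the circular dependency
can be modelled with a toy lock-order graph. This is only a sketch of the
idea, not how lockdep is implemented: the LockOrderGraph class and its
acquire() method are hypothetical names, and the single
"console_sem.lock -> rq->__lock" edge stands in for the existing dependency
chain that the report mentions but does not show.

```python
from collections import defaultdict


class LockOrderGraph:
    """Toy model of lock-order tracking (hypothetical, not the kernel's lockdep)."""

    def __init__(self):
        # edges[a] holds every lock class that was acquired while `a` was held.
        self.edges = defaultdict(set)

    def _reachable(self, src, dst):
        """Depth-first search: is there a recorded path src -> ... -> dst?"""
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.edges[node])
        return False

    def acquire(self, held, new):
        """Record that `new` is taken while `held` is held.

        Returns False, without recording the edge, if `held` is already
        reachable from `new`, i.e. the new ordering would close a cycle.
        """
        if self._reachable(new, held):
            return False
        self.edges[held].add(new)
        return True


graph = LockOrderGraph()

# Assumption: one edge stands in for the existing chain the report elides
# ("which lock already depends on the new lock").
ok_existing = graph.acquire("console_sem.lock", "rq->__lock")

# The hook runs from trace_sched_switch with rq->__lock held, and the
# fault-injection printk then tries to take console_sem.lock.
ok_hook = graph.acquire("rq->__lock", "console_sem.lock")

print("existing ordering accepted:", ok_existing)
print("hook ordering accepted:", ok_hook)
```

The second acquire() is rejected because it would close the cycle. That is
the ordering introduced by the code attached to the tracepoint, not by the
scheduler itself.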