Re: [BUG] possible deadlock in __schedule (with reproducer available)

From: Masami Hiramatsu (Google)
Date: Fri Nov 29 2024 - 03:36:11 EST


On Sat, 23 Nov 2024 03:39:45 +0000
Ruan Bonan <bonan.ruan@xxxxxxxxx> wrote:

>
> vprintk_emit+0x414/0xb90 kernel/printk/printk.c:2406
> _printk+0x7a/0xa0 kernel/printk/printk.c:2432
> fail_dump lib/fault-inject.c:46 [inline]
> should_fail_ex+0x3be/0x570 lib/fault-inject.c:154
> strncpy_from_user+0x36/0x230 lib/strncpy_from_user.c:118
> strncpy_from_user_nofault+0x71/0x140 mm/maccess.c:186
> bpf_probe_read_user_str_common kernel/trace/bpf_trace.c:215 [inline]
> ____bpf_probe_read_user_str kernel/trace/bpf_trace.c:224 [inline]

Hmm, this issue comes from the combination of BPF and fault injection.

static void fail_dump(struct fault_attr *attr)
{
	if (attr->verbose > 0 && __ratelimit(&attr->ratelimit_state)) {
		printk(KERN_NOTICE "FAULT_INJECTION: forcing a failure.\n"
		       "name %pd, interval %lu, probability %lu, "
		       "space %d, times %d\n", attr->dname,
		       attr->interval, attr->probability,
		       atomic_read(&attr->space),
		       atomic_read(&attr->times));

This printk() tries to acquire the console lock while rq->lock is already held.

The same can happen when fault injection is combined with trace events,
because fault injection triggers the printk() warning.
I think this is a bug in fault injection, not in tracing/BPF.
To solve it, we may be able to check the context, and if it is
tracing, NMI, etc., fault injection should NOT inject a failure.
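
As a rough illustration of that idea, here is a userspace model only, not
a kernel patch. Every name in it (exec_ctx, current_ctx, fault_attr_model,
ctx_allows_injection, should_fail_model) is hypothetical and made up for
this sketch. In the kernel the context test would have to come from real
helpers such as in_nmi(); what the right test for the tracing case would be
is left open here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for the contexts a fault point can be hit from. */
enum exec_ctx { CTX_TASK, CTX_TRACING, CTX_NMI };

/* Hypothetical, heavily reduced model of a fault attribute. */
struct fault_attr_model {
	int verbose;
	unsigned long injected;	/* failures actually forced */
	unsigned long skipped;	/* failures suppressed due to context */
};

/* Models "which context are we in"; a test sets this by hand. */
static enum exec_ctx current_ctx = CTX_TASK;

/* Only plain task context is treated as safe for injection (assumption). */
static bool ctx_allows_injection(enum exec_ctx ctx)
{
	return ctx == CTX_TASK;
}

/*
 * Decide whether to force a failure. The context check comes first, so
 * the printing path is never reached from tracing or NMI context.
 */
static bool should_fail_model(struct fault_attr_model *attr)
{
	assert(attr != NULL);

	if (!ctx_allows_injection(current_ctx)) {
		attr->skipped++;
		return false;
	}

	if (attr->verbose > 0)
		fprintf(stderr, "FAULT_INJECTION (model): forcing a failure.\n");

	attr->injected++;
	return true;
}
```

The point of the ordering is that the message is only printed after the
context has been judged safe, so a caller that may already hold rq->lock
never reaches the printing path through this function.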

Thank you,

--
Masami Hiramatsu (Google) <mhiramat@xxxxxxxxxx>