Re: [syzbot] [trace?] [bpf?] KASAN: slab-use-after-free Read in bpf_trace_run2 (2)

From: Mathieu Desnoyers
Date: Tue Oct 22 2024 - 11:21:16 EST

Next message: syzbot: "Re: [syzbot] KMSAN: uninit-value in __hfs_ext_cache_extent (2)"
Previous message: AngeloGioacchino Del Regno: "Re: [PATCH 4/6] mfd: mt6397: Add initial support for MT6328"
In reply to: Steven Rostedt: "Re: [syzbot] [trace?] [bpf?] KASAN: slab-use-after-free Read in bpf_trace_run2 (2)"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 2024-10-22 04:20, Steven Rostedt wrote:

Mathieu, can you look at this?

Sure,

[ more below ]

On Mon, 21 Oct 2024 18:23:47 +0000
Jordan Rife <jrife@xxxxxxxxxx> wrote:

I performed a bisection and this issue starts with commit a363d27cdbc2
("tracing: Allow system call tracepoints to handle page faults") which
introduces this change.

+ *
+ * With @syscall=0, the tracepoint callback array dereference is
+ * protected by disabling preemption.
+ * With @syscall=1, the tracepoint callback array dereference is
+ * protected by Tasks Trace RCU, which allows probes to handle page
+ * faults.
*/
#define __DO_TRACE(name, args, cond, syscall) \
do { \
@@ -204,11 +212,17 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
if (!(cond)) \
return; \
\
- preempt_disable_notrace(); \
+ if (syscall) \
+ rcu_read_lock_trace(); \
+ else \
+ preempt_disable_notrace(); \
\
__DO_TRACE_CALL(name, TP_ARGS(args)); \
\
- preempt_enable_notrace(); \
+ if (syscall) \
+ rcu_read_unlock_trace(); \
+ else \
+ preempt_enable_notrace(); \
} while (0)

Link: https://lore.kernel.org/bpf/20241009010718.2050182-6-mathieu.desnoyers@xxxxxxxxxxxx/

I reproduced the bug locally by running syz-execprog inside a QEMU VM.

./syz-execprog -repeat=0 -procs=5 ./repro.syz.txt

I /think/ what is happening is that with this change preemption may now
occur leading to a scenario where the RCU grace period is insufficient
in a few places where call_rcu() is used. In other words, there are a
few scenarios where call_rcu_tasks_trace() should be used instead to
prevent a use-after-free bug when a preempted tracepoint call tries to
access a program, link, etc. that was freed. It seems the syzkaller
program induces page faults while attaching raw tracepoints to
sys_enter making preemption more likely to occur.

kernel/tracepoint.c
===================

...
static inline void release_probes(struct tracepoint_func *old)
{
...
call_rcu(&tp_probes->rcu, rcu_free_old_probes); <-- Here

Have you tried just changing this one to call_rcu_tasks_trace()?

Actually, I see two possible solutions there:

1) If we want to keep unchanged register/unregister functions as a single
API for normal and syscall tracepoints, and we don't want to add additional
flags in the tracepoint struct, then we need to chain the call_rcu:

call_rcu() -> rcu_free_old_probes -> call_rcu_tasks_trace() -> rcu_tasks_trace_free_old_probes.

Just like we did when we used SRCU.

This is not perfect because we'd be adding extra delay for reclaim of
the old probes array for every tracepoint being updated, not just the
syscall tracepoints, but we probably don't care. This is a straightforward
initial solution.

We did something similar with SRCU before and it was OK.

2) If we want something more elegant, we can add a flag to struct tracepoint
for syscall tracepoints, and use that flag to choose between call_rcu()
and call_rcu_tasks_trace(). But maybe this additional complexity is not
even useful.

I'll prepare a patch implementing (1) and send it your way as RFC for further
testing.

Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com

Next message: syzbot: "Re: [syzbot] KMSAN: uninit-value in __hfs_ext_cache_extent (2)"
Previous message: AngeloGioacchino Del Regno: "Re: [PATCH 4/6] mfd: mt6397: Add initial support for MT6328"
In reply to: Steven Rostedt: "Re: [syzbot] [trace?] [bpf?] KASAN: slab-use-after-free Read in bpf_trace_run2 (2)"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]