Mathieu, can you look at this?
[ more below ]
On Mon, 21 Oct 2024 18:23:47 +0000
Jordan Rife <jrife@xxxxxxxxxx> wrote:
I performed a bisection and this issue starts with commit a363d27cdbc2
("tracing: Allow system call tracepoints to handle page faults") which
introduces this change.
+ *
+ * With @syscall=0, the tracepoint callback array dereference is
+ * protected by disabling preemption.
+ * With @syscall=1, the tracepoint callback array dereference is
+ * protected by Tasks Trace RCU, which allows probes to handle page
+ * faults.
*/
#define __DO_TRACE(name, args, cond, syscall) \
do { \
@@ -204,11 +212,17 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
if (!(cond)) \
return; \
\
- preempt_disable_notrace(); \
+ if (syscall) \
+ rcu_read_lock_trace(); \
+ else \
+ preempt_disable_notrace(); \
\
__DO_TRACE_CALL(name, TP_ARGS(args)); \
\
- preempt_enable_notrace(); \
+ if (syscall) \
+ rcu_read_unlock_trace(); \
+ else \
+ preempt_enable_notrace(); \
} while (0)
Link: https://lore.kernel.org/bpf/20241009010718.2050182-6-mathieu.desnoyers@xxxxxxxxxxxx/
I reproduced the bug locally by running syz-execprog inside a QEMU VM.
./syz-execprog -repeat=0 -procs=5 ./repro.syz.txt
I /think/ what is happening is that with this change preemption may now
occur, leading to a scenario where the RCU grace period is insufficient
in a few places where call_rcu() is used. In other words, there are a
few scenarios where call_rcu_tasks_trace() should be used instead to
prevent a use-after-free bug when a preempted tracepoint call tries to
access a program, link, etc. that was freed. It seems the syzkaller
program induces page faults while attaching raw tracepoints to
sys_enter, making preemption more likely to occur.
kernel/tracepoint.c
===================
...
static inline void release_probes(struct tracepoint_func *old)
{
...
call_rcu(&tp_probes->rcu, rcu_free_old_probes); <-- Here
Have you tried just changing this one to call_rcu_tasks_trace()?
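
For reference, a minimal sketch of what that one-line swap could look like.
This is a userspace mock, not the kernel source: struct rcu_head, the struct
layouts, the call_rcu_tasks_trace() stub (which runs the callback immediately
instead of waiting for a grace period), the two counters, and the
alloc_probes() helper are all hypothetical stand-ins so the fragment compiles
on its own. Only the names release_probes(), rcu_free_old_probes(), call_rcu()
and call_rcu_tasks_trace() come from the discussion above.

```c
#include <stddef.h>
#include <stdlib.h>

/* Mock stand-ins for kernel types (assumed layouts, illustration only). */
struct rcu_head {
	void (*func)(struct rcu_head *head);
};

struct tracepoint_func {
	void *func;
	void *data;
	int prio;
};

struct tp_probes {
	struct rcu_head rcu;
	struct tracepoint_func probes[];
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical counters so the mock's behavior can be observed. */
static int tasks_trace_callbacks;
static int probes_freed;

/*
 * Stub only: the real call_rcu_tasks_trace() defers the callback until a
 * Tasks Trace RCU grace period has elapsed. This mock invokes it right away.
 */
static void call_rcu_tasks_trace(struct rcu_head *head,
				 void (*func)(struct rcu_head *head))
{
	tasks_trace_callbacks++;
	head->func = func;
	func(head);
}

/* Frees the probe array once the (mocked) grace period is over. */
static void rcu_free_old_probes(struct rcu_head *head)
{
	free(container_of(head, struct tp_probes, rcu));
	probes_freed++;
}

static inline void release_probes(struct tracepoint_func *old)
{
	if (old) {
		struct tp_probes *tp_probes =
			container_of(old, struct tp_probes, probes);

		/* was: call_rcu(&tp_probes->rcu, rcu_free_old_probes); */
		call_rcu_tasks_trace(&tp_probes->rcu, rcu_free_old_probes);
	}
}

/* Hypothetical helper: allocate a probe array the way the mock expects. */
static struct tracepoint_func *alloc_probes(size_t count)
{
	struct tp_probes *p = malloc(sizeof(*p) + count * sizeof(p->probes[0]));

	return p ? p->probes : NULL;
}
```

One caveat with a blanket swap like this: per the comment in the diff above,
tracepoints with @syscall=0 are still protected by disabling preemption, so
they would presumably still need a regular RCU grace period before the old
probes are freed.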