Re: [PATCH v2] bpf: Fix use-after-free in __bpf_trace_run()
From: Qing Wang
Date: Thu Mar 05 2026 - 02:19:38 EST
On Thu, 05 Mar 2026 at 09:38, Jordan Rife <jrife@xxxxxxxxxx> wrote:
> > A use-after-free issue reported from syzbot exists in __bpf_trace_run().
> >
> > BUG: KASAN: slab-use-after-free in __bpf_trace_run kernel/trace/bpf_trace.c:2075 [inline]
> > -> struct bpf_prog *prog = link->link.prog;
> >
> > The link(struct bpf_raw_tp_link) was freed before accessing
> > link->link.prog.
> >
> > The root cause is that: When bpf_probe_unregister() is called, tasks may
> > have already entered the old tp_probes array (RCU read-side section)
> > before rcu_assign_pointer() updates tp->funcs. These tasks can access the
> > link through the old array. Without synchronization, the link can be freed
> > via call_rcu() after bpf_probe_unregister() in bpf_link_free(), leading to
> > use-after-free in __bpf_trace_run().
> >
> > CPU 0 (free link)                          CPU 1 (enter old tp probe)
> > ─────────────────                          ────────────────────────
> >
> >                                            rcu_read_lock()
> >                                            old_funcs = tp->funcs
> > bpf_raw_tp_link_release()
> >   bpf_probe_unregister()
> >     rcu_assign_pointer(tp->funcs, new)
> >     call_srcu/call_rcu_tasks_trace(old_tp)
> > ...
> > call_rcu/call_rcu_tasks_trace(&link->rcu, ...)
>
> If CPU 1 is in an RCU read-side section, then call_rcu would wait for
> the RCU GP anyway before freeing the link in question.
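Sorry, that was my mistake: the diagram should show
'srcu_read_lock(&tracepoint_srcu)' [0] rather than rcu_read_lock(), which
misled you. CPU 1 is only inside the tracepoint SRCU read-side section, so
only an SRCU grace period on tracepoint_srcu waits for it; the RCU grace
period behind call_rcu() does not.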
[0] include/linux/tracepoint.h:279

#define __DECLARE_TRACE(name, proto, args, cond, data_proto) \
    __DECLARE_TRACE_COMMON(name, PARAMS(proto), PARAMS(args), PARAMS(data_proto)) \
    static inline void __do_trace_##name(proto) \
    { \
        TRACEPOINT_CHECK(name) \
        if (cond) { \
            guard(srcu_fast_notrace)(&tracepoint_srcu); \ <----
            __DO_TRACE_CALL(name, TP_ARGS(args)); \
        } \
    }
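
To make the mismatch concrete, here is a toy user-space model. It is not
kernel code: struct gp_domain and every helper name below are made up for
illustration. It assumes only that each grace-period domain tracks its own
readers, so a reader inside tracepoint_srcu does not hold up a plain RCU
grace period.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one grace-period domain (e.g. RCU or one SRCU
 * struct). A domain knows only about readers that entered through it. */
struct gp_domain {
    int active_readers;
};

static void domain_read_lock(struct gp_domain *d)
{
    d->active_readers++;
}

static void domain_read_unlock(struct gp_domain *d)
{
    assert(d->active_readers > 0);
    d->active_readers--;
}

/* A grace period in domain d can end only once d has no active readers. */
static bool grace_period_can_end(const struct gp_domain *d)
{
    return d->active_readers == 0;
}

/* Models the diagram: CPU 1 enters the probe under reader_domain while
 * CPU 0 defers the free behind a grace period of waited_domain.
 * Returns true if the free could run while CPU 1 is still inside. */
static bool free_would_race(struct gp_domain *reader_domain,
                            struct gp_domain *waited_domain)
{
    bool racy;

    domain_read_lock(reader_domain);            /* CPU 1: enter probe */
    racy = grace_period_can_end(waited_domain); /* CPU 0: may kfree now? */
    domain_read_unlock(reader_domain);          /* CPU 1: leave probe */

    return racy;
}
```

With two separate domains, rcu and srcu, free_would_race(&srcu, &rcu)
returns true in this model, while free_would_race(&srcu, &srcu) returns
false: the free is only held back when it waits on the same domain the
reader entered.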
> > (RCU grace period)
> > kfree(link)
> >                                            __bpf_trace_run(link, ...)
> >                                              access link->link.prog
> >                                              UAF!
> >
> > Fix by calling tracepoint_synchronize_unregister() to ensure all
> > in-flight tracepoint callbacks have completed, so the link is no
> > longer reachable before it is freed.
>
> It looks like tracepoint_synchronize_unregister() just calls
> synchronize_rcu_tasks_trace() and synchronize_rcu(), but it should also
> be sufficient to use call_rcu() or call_rcu_tasks_trace() to ensure that
> the appropriate grace period elapses for that tracepoint. Is the extra
> delay just masking the problem instead of fixing the root cause?
I think synchronize_srcu(&tracepoint_srcu) is enough to ensure that tasks
still using the old tp_probes array have left their SRCU read-side section
before kfree(link). Whether to use tracepoint_synchronize_unregister()
instead needs further discussion.
> > The issue was introduced by commit d4dfc5700e86 ("bpf:
> > pass whole link instead of prog when triggering raw tracepoint"),
> > which changed tracepoint callbacks to receive bpf_raw_tp_link pointers
> > instead of bpf_prog pointers.
>
> Did you run a bisect?
I'm trying to, but I haven't been able to reproduce the issue yet.
--
Qing