Re: [PATCH] bpf-next: Prevent out of bound buffer write in __bpf_get_stack
From: Andrii Nakryiko
Date: Mon Jan 05 2026 - 19:53:13 EST
On Sun, Jan 4, 2026 at 12:52 PM Arnaud Lecomte <contact@xxxxxxxxxxxxxx> wrote:
>
> Syzkaller reported a KASAN slab-out-of-bounds write in __bpf_get_stack()
> during stack trace copying.
>
> The issue occurs when the callchain entry (stored as a per-cpu variable)
> grows between collection and buffer copy, causing it to exceed the buffer
> size initially calculated from max_depth.
>
> The callchain collection intentionally avoids locking for performance
> reasons, but this creates a window where concurrent modifications can
> occur during the copy operation.
>
> To prevent this from happening, we clamp the trace length to the max
> depth initially calculated from the buffer size and the size of
> a trace entry.
>
> Reported-by: syzbot+d1b7fa1092def3628bd7@xxxxxxxxxxxxxxxxxxxxxxxxx
> Closes: https://lore.kernel.org/all/691231dc.a70a0220.22f260.0101.GAE@xxxxxxxxxx/T/
> Fixes: e17d62fedd10 ("bpf: Refactor stack map trace depth calculation into helper function")
> Tested-by: syzbot+d1b7fa1092def3628bd7@xxxxxxxxxxxxxxxxxxxxxxxxx
> Cc: Brahmajit Das <listout@xxxxxxxxxxx>
> Signed-off-by: Arnaud Lecomte <contact@xxxxxxxxxxxxxx>
> ---
> Thanks to Brahmajit Das for the initial fix he proposed, which I
> tweaked with the correct justification and, in my opinion, a better
> implementation.
> ---
> kernel/bpf/stackmap.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
> index da3d328f5c15..e56752a9a891 100644
> --- a/kernel/bpf/stackmap.c
> +++ b/kernel/bpf/stackmap.c
> @@ -465,7 +465,6 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task,
>
> if (trace_in) {
> trace = trace_in;
> - trace->nr = min_t(u32, trace->nr, max_depth);
> } else if (kernel && task) {
> trace = get_callchain_entry_for_task(task, max_depth);
> } else {
> @@ -479,7 +478,8 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task,
> goto err_fault;
> }
>
> - trace_nr = trace->nr - skip;
> + trace_nr = min(trace->nr, max_depth);
There is a `trace->nr < skip` check right above; should it be moved here
and done against the adjusted trace_nr (but before we subtract skip, of
course)?
> + trace_nr = trace_nr - skip;
> copy_len = trace_nr * elem_size;
>
> ips = trace->ip + skip;
> --
> 2.43.0
>
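
To make the ordering question above concrete, here is a minimal userspace
sketch, not the kernel code. It assumes the clamp happens first and the
`skip` comparison is then done against the clamped count, so the unsigned
subtraction cannot wrap if the clamped value ever ends up smaller than
`skip`. The names `demo_callchain` and `demo_trace_nr` are hypothetical and
exist only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a callchain entry; not the kernel struct. */
struct demo_callchain {
	uint64_t nr;     /* number of valid entries in ip[] */
	uint64_t ip[8];  /* instruction pointers (unused in this sketch) */
};

/*
 * Return how many entries would be copied, or -1 when there is nothing
 * valid to copy. Assumed ordering: clamp to max_depth, then compare
 * against skip, then subtract skip.
 */
static long demo_trace_nr(const struct demo_callchain *trace,
			  uint32_t max_depth, uint32_t skip)
{
	uint32_t trace_nr;

	if (!trace)
		return -1;

	/* Clamp first, so later growth of nr cannot exceed the buffer. */
	trace_nr = trace->nr < max_depth ? (uint32_t)trace->nr : max_depth;

	/* Check against the clamped value, before subtracting skip. */
	if (trace_nr < skip)
		return -1;

	return (long)(trace_nr - skip);
}
```

In this sketch, an entry whose `nr` has grown past `max_depth` is cut back
to `max_depth` before `skip` is applied, and a `skip` larger than the
clamped count is rejected instead of producing a wrapped length.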