Re: [PATCH V10 RESEND 2/2] bpf: fix stackmap overflow check in __bpf_get_stackid()

From: Guenter Roeck

Date: Sun Mar 29 2026 - 19:51:42 EST


Hi,

On Sat, Oct 25, 2025 at 07:29:41PM +0000, Arnaud Lecomte wrote:
> Syzkaller reported a KASAN slab-out-of-bounds write in __bpf_get_stackid()
> when copying stack trace data. The issue occurs when the perf trace
> contains more stack entries than the stack map bucket can hold,
> leading to an out-of-bounds write in the bucket's data array.
>
> Reported-by: syzbot+c9b724fbb41cf2538b7b@xxxxxxxxxxxxxxxxxxxxxxxxx
> Closes: https://syzkaller.appspot.com/bug?extid=c9b724fbb41cf2538b7b
> Fixes: ee2a098851bf ("bpf: Adjust BPF stack helper functions to accommodate skip > 0")
> Acked-by: Yonghong Song <yonghong.song@xxxxxxxxx>
> Acked-by: Song Liu <song@xxxxxxxxxx>
> Signed-off-by: Arnaud Lecomte <contact@xxxxxxxxxxxxxx>
> ---
> Changes in v2:
> - Fixed max_depth names across get stack id
>
> Changes in v4:
> - Removed unnecessary empty line in __bpf_get_stackid
>
> Changes in v6:
> - Added back trace_len computation in __bpf_get_stackid
>
> Changes in v7:
> - Removed unneeded trace->nr assignment in bpf_get_stackid_pe
> - Added restoration of trace->nr for both kernel and user traces
> in bpf_get_stackid_pe
>
> Changes in v9:
> - Fixed variable declarations in bpf_get_stackid_pe
> - Added the missing truncation of trace_nr in __bpf_get_stackid
>
> Changes in v10:
> - Removed the unneeded trace->nr = nr_kernel; assignment in bpf_get_stackid_pe
>
> Link to v9:
> https://lore.kernel.org/all/20250912233558.75076-1-contact@xxxxxxxxxxxxxx/
> ---
> kernel/bpf/stackmap.c | 15 ++++++++-------
> 1 file changed, 8 insertions(+), 7 deletions(-)
>
> diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
> index 5e9ad050333c..2365541c81dd 100644
> --- a/kernel/bpf/stackmap.c
> +++ b/kernel/bpf/stackmap.c
> @@ -251,8 +251,8 @@ static long __bpf_get_stackid(struct bpf_map *map,
> {
> struct bpf_stack_map *smap = container_of(map, struct bpf_stack_map, map);
> struct stack_map_bucket *bucket, *new_bucket, *old_bucket;
> + u32 hash, id, trace_nr, trace_len, i, max_depth;
> u32 skip = flags & BPF_F_SKIP_FIELD_MASK;
> - u32 hash, id, trace_nr, trace_len, i;
> bool user = flags & BPF_F_USER_STACK;
> u64 *ips;
> bool hash_matches;
> @@ -261,7 +261,8 @@ static long __bpf_get_stackid(struct bpf_map *map,
> /* skipping more than usable stack trace */
> return -EFAULT;
>
> - trace_nr = trace->nr - skip;
> + max_depth = stack_map_calculate_max_depth(map->value_size, stack_map_data_size(map), flags);
> + trace_nr = min_t(u32, trace->nr - skip, max_depth - skip);

I stumbled over this patch while researching a bpf related crash.
I assume that the problem it tries to solve is to prevent the
"skip > max_depth" condition.

From the context, it is guaranteed that trace->nr > skip, so we know that
trace->nr - skip is positive. If skip <= max_depth, the min_t() above
constrains trace_nr to the difference, and there is no problem. However,
if skip > max_depth, "max_depth - skip" underflows to a large positive
number, effectively leaving trace->nr - skip unrestricted.

Is the condition "max_depth >= skip" guaranteed somewhere? skip itself
seems to be provided by userspace, and stack_map_calculate_max_depth()
returns at most curr_sysctl_max_stack. What happens if skip is larger
than curr_sysctl_max_stack?

My apologies for the noise if I am completely missing the point.

Thanks,
Guenter