Re: [PATCH bpf-next v5 3/3] bpf: Cache build IDs in sleepable stackmap path

From: Andrii Nakryiko

Date: Fri May 15 2026 - 22:20:50 EST


On Fri, May 15, 2026 at 4:59 PM Ihor Solodrai <ihor.solodrai@xxxxxxxxx> wrote:
>
> On 5/15/26 3:40 PM, Andrii Nakryiko wrote:
> > On Thu, May 14, 2026 at 5:53 PM Ihor Solodrai <ihor.solodrai@xxxxxxxxx> wrote:
> >>
> >> Stack traces often contain adjacent IPs from the same VMA or from
> >> different VMAs backed by the same ELF file. Cache the last successfully
> >> parsed build ID together with the resolved VMA range and backing file
> >> so the sleepable build-ID path can avoid repeated VMA locking and file
> >> parsing in common cases.
> >>
> >> Suggested-by: Mykyta Yatsenko <yatsenko@xxxxxxxx>
> >> Acked-by: Mykyta Yatsenko <yatsenko@xxxxxxxx>
> >> Signed-off-by: Ihor Solodrai <ihor.solodrai@xxxxxxxxx>
> >> ---
> >> kernel/bpf/stackmap.c | 56 ++++++++++++++++++++++++++++++++++++++++---
> >> 1 file changed, 53 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
> >> index 08f7659505d1..7336fd55c856 100644
> >> --- a/kernel/bpf/stackmap.c
> >> +++ b/kernel/bpf/stackmap.c
> >> @@ -230,13 +230,33 @@ static void stack_map_get_build_id_offset_sleepable(struct bpf_stack_build_id *i
> >> .vma = NULL,
> >> .mm = mm,
> >> };
> >> - unsigned long vm_pgoff, vm_start;
> >> + struct {
> >> + struct file *file;
> >> + const unsigned char *build_id;
> >> + unsigned long vm_start;
> >> + unsigned long vm_end;
> >> + unsigned long vm_pgoff;
> >> + } cache = {};
> >> + unsigned long vm_pgoff, vm_start, vm_end;
> >> struct vm_area_struct *vma;
> >> struct file *file;
> >> u64 ip;
> >>
> >> for (u32 i = 0; i < trace_nr; i++) {
> >> ip = READ_ONCE(id_offs[i].ip);
> >> +
> >> + /*
> >> + * Range cache fast path: if ip falls within the previously
> >> + * resolved VMA range, reuse the cache build_id without
> >> + * re-acquiring the VMA lock.
> >> + */
> >> + if (cache.build_id && ip >= cache.vm_start && ip < cache.vm_end) {
> >> + vm_start = cache.vm_start;
> >> + vm_end = cache.vm_end;
> >> + vm_pgoff = cache.vm_pgoff;
> >
> > isn't this unnecessarily convoluted way of doing things? If we are
> > here, we know for sure a) how to calculate offset and b) we need to
> > memcpy. So just do that here and just then maybe jump all the way to
> > id_offs[i].offset setting. Or just fill offset, status and memcpy
> > build id here and continue.
> >
> >> + goto build_id_valid;
> >> + }
> >> +
> >> vma = stack_map_lock_vma(&lock, ip);
> >> if (!vma || !vma->vm_file) {
> >> stack_map_build_id_set_ip(&id_offs[i]);
> >> @@ -244,9 +264,21 @@ static void stack_map_get_build_id_offset_sleepable(struct bpf_stack_build_id *i
> >> continue;
> >> }
> >>
> >> - file = get_file(vma->vm_file);
> >> + file = vma->vm_file;
> >> vm_pgoff = vma->vm_pgoff;
> >> vm_start = vma->vm_start;
> >> + vm_end = vma->vm_end;
> >> +
> >> + if (file == cache.file) {
> >> + /*
> >> + * Same backing file as previous (e.g. different VMAs
> >> + * of the same ELF binary). Reuse the cache build_id.
> >> + */
> >> + stack_map_unlock_vma(&lock);
> >> + goto build_id_valid;
> >> + }
> >
> > same thing here, file matched, so memcpy(build_id), calculate correct
> > offset, set offset and status, unlock and continue. This logic is
> > pretty straightforward, IMO it's cleaner to copy-paste these 4 lines
> > than dealing with the goto maze...
>
>
> Eeh... it's a style preference I guess. I had two memcpy() points
> originally, and then Mykyta suggested to consolidate them, so I did.
>
> Having a single "build_id_valid" point with a common "ok" sequence
> makes sense to me, but at the same time I am not allergic to 4-line
> copy pastes.
>
> AFAIU with your suggestion we'll still have a label at
> id_offs[i].offset = ... so it's not like we'll get rid of goto right?

I'd forego goto altogether. You extracted stack_map_build_id_set_ip()
for absolute IP case, why not add stack_map_build_id_set_offset() (or
something a bit more reasonably named):

void stack_map_build_id_set_offset(struct bpf_stack_build_id *dst,
unsigned long off, void *build_id)
{
memcpy(dst->build_id, build_id, BUILD_ID_SIZE_MAX);
dst->offset = off;
dst->status = BPF_STACK_BUILD_ID_VALID;
}

and then `stack_map_build_id_set_offset() + continue` for each of
three possible build id cases

>
> >
> >
> >> +
> >> + file = get_file(file);
> >> stack_map_unlock_vma(&lock);
> >>
> >> /* build_id_parse_file() may block on filesystem reads */
> >> @@ -255,11 +287,29 @@ static void stack_map_get_build_id_offset_sleepable(struct bpf_stack_build_id *i
> >> fput(file);
> >> continue;
> >> }
> >> - fput(file);
> >>
> >> + if (cache.file)
> >> + fput(cache.file);
> >> + cache.file = file;
> >> + cache.build_id = id_offs[i].build_id;
> >> +
> >> +build_id_valid:
> >> + /*
> >> + * In the slow path cache.build_id points to id_offs[i].build_id.
> >> + * Cache hits leave cache.build_id pointing at a prior slot,
> >> + * triggering the memcpy here.
> >> + */
> >> + if (cache.build_id != id_offs[i].build_id)
> >> + memcpy(id_offs[i].build_id, cache.build_id, BUILD_ID_SIZE_MAX);
> >> + cache.vm_start = vm_start;
> >> + cache.vm_end = vm_end;
> >> + cache.vm_pgoff = vm_pgoff;
> >> id_offs[i].offset = (vm_pgoff << PAGE_SHIFT) + ip - vm_start;
> >> id_offs[i].status = BPF_STACK_BUILD_ID_VALID;
> >> }
> >> +
> >> + if (cache.file)
> >> + fput(cache.file);
> >> }
> >>
> >> /*
> >> --
> >> 2.54.0
> >>
>