Re: [RFC PATCH v3] bpf: Using binary search to improve the performance of btf_find_by_name_kind

From: Donglin Peng
Date: Sat Jun 15 2024 - 07:49:43 EST


On Tue, Jun 11, 2024 at 6:13 PM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote:
>
> On Sat, 2024-06-08 at 07:08 -0700, Donglin Peng wrote:
>
> [...]
>
> > Changes in RFC v3:
> > - Sort the btf types during the build process in order to reduce memory usage
> > and decrease boot time.
> >
> > RFC v2:
> > - https://lore.kernel.org/all/20230909091646.420163-1-pengdonglin@xxxxxxxxxxxxxx
> > ---
> > include/linux/btf.h | 1 +
> > kernel/bpf/btf.c | 160 +++++++++++++++++++++++++++++++++---
>
> I think that kernel part is in a good shape,
> please split it as a separate commit.

Okay, thanks.

>
> > tools/lib/bpf/btf.c | 195 ++++++++++++++++++++++++++++++++++++++++++++
> > 3 files changed, 345 insertions(+), 11 deletions(-)
>
> [...]
>
> > diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> > index 2d0840ef599a..93c1ab677bfa 100644
>
> I'm not sure that libbpf is the best place to put this functionality,
> as there might be different kinds of orderings
> (e.g. see a fresh commit to bpftool to output stable vmlinux.h:
> 94133cf24bb3 "bpftool: Introduce btf c dump sorting").

Thanks, I think it would be better to put it into the libbpf. However, I would
also like to hear the opinions of others.

>
> I'm curious what Andrii, Alan and Arnaldo think on libbpf vs pahole
> for this feature.
>
> Also, I have a selftests build failure with this patch-set
> (and I suspect that a bunch of dedup test cases would need an update):

I appologize for the bug in my patch that caused the issue. I will fix it.

>
> $ pwd
> /home/eddy/work/bpf-next/tools/testing/selftests/bpf
> $ make -j14 test_progs
> ...
>
> GEN-SKEL [test_progs] access_map_in_map.skel.h
> Binary files /home/eddy/work/bpf-next/tools/testing/selftests/bpf/access_map_in_map.bpf.linked2.o and /home/eddy/work/bpf-next/tools/testing/selftests/bpf/access_map_in_map.bpf.linked3.o differ
> make: *** [Makefile:658: /home/eddy/work/bpf-next/tools/testing/selftests/bpf/access_map_in_map.skel.h] Error 1
> make: *** Waiting for unfinished jobs....

Sorry, I neglected to perform an ID remap for the btf_types in the BTF.ext
section. I will fix it.

>
> If this change remains in libbpf, I think it would be great to update
> btf_find_by_name_kind() to work the same way as kernel one.

Sounds good, we might do it later.

>
> > --- a/tools/lib/bpf/btf.c
> > +++ b/tools/lib/bpf/btf.c
>
> [...]
>
> > +static int btf_sort_type_by_name(struct btf *btf)
> > +{
> > + struct btf_type *bt;
> > + __u32 *new_type_offs = NULL, *new_type_offs_noname = NULL;
> > + __u32 *maps = NULL, *found_offs;
> > + void *new_types_data = NULL, *loc_data;
> > + int i, j, k, type_cnt, ret = 0, type_size;
> > + __u32 data_size;
> > +
> > + if (btf_ensure_modifiable(btf))
> > + return libbpf_err(-ENOMEM);
> > +
> > + type_cnt = btf->nr_types;
> > + data_size = btf->type_offs_cap * sizeof(*new_type_offs);
> > +
> > + maps = (__u32 *)malloc(type_cnt * sizeof(__u32));
> > + if (!maps) {
> > + ret = -ENOMEM;
> > + goto err_out;
> > + }
> > +
> > + new_type_offs = (__u32 *)malloc(data_size);
> > + if (!new_type_offs) {
> > + ret = -ENOMEM;
> > + goto err_out;
> > + }
> > +
> > + new_type_offs_noname = (__u32 *)malloc(data_size);
> > + if (!new_type_offs_noname) {
> > + ret = -ENOMEM;
> > + goto err_out;
> > + }
>
> What is the point of separating offsets in new_type_offs vs
> new_type_offs_noname? It should be possible to use a single offsets
> array and have a comparison function that puts all named types before
> unnamed.

Great, you are right.

>
> > +
> > + new_types_data = malloc(btf->types_data_cap);
> > + if (!new_types_data) {
> > + ret = -ENOMEM;
> > + goto err_out;
> > + }
> > +
> > + memset(new_type_offs, 0, data_size);
> > +
> > + for (i = 0, j = 0, k = 0; i < type_cnt; i++) {
> > + const char *name;
> > +
> > + bt = (struct btf_type *)(btf->types_data + btf->type_offs[i]);
> > + name = btf__str_by_offset(btf, bt->name_off);
> > + if (!name || !name[0])
> > + new_type_offs_noname[k++] = btf->type_offs[i];
> > + else
> > + new_type_offs[j++] = btf->type_offs[i];
> > + }
> > +
> > + memmove(new_type_offs + j, new_type_offs_noname, sizeof(__u32) * k);
> > +
> > + qsort_r(new_type_offs, j, sizeof(*new_type_offs),
> > + btf_compare_type_name, btf);
> > +
> > + for (i = 0; i < type_cnt; i++) {
> > + found_offs = bsearch(&new_type_offs[i], btf->type_offs, type_cnt,
> > + sizeof(__u32), btf_compare_offs);
> > + if (!found_offs) {
> > + ret = -EINVAL;
> > + goto err_out;
> > + }
> > + maps[found_offs - btf->type_offs] = i;
> > + }
> > +
> > + loc_data = new_types_data;
> > + for (i = 0; i < type_cnt; i++, loc_data += type_size) {
> > + bt = (struct btf_type *)(btf->types_data + new_type_offs[i]);
> > + type_size = btf_type_size(bt);
> > + if (type_size < 0) {
> > + ret = type_size;
> > + goto err_out;
> > + }
> > +
> > + memcpy(loc_data, bt, type_size);
> > +
> > + bt = (struct btf_type *)loc_data;
> > + switch (btf_kind(bt)) {
>
> Please take a look at btf_dedup_remap_types(): it uses newly added
> iterator interface to enumerate all ID references in the type.
> It could be used here to avoid enumerating every BTF kind.
> Also, the d->hypot_map could be used instead of `maps`.
> And if so, I think that it should be possible to put this pass before
> btf_dedup_remap_types() in order for it to do the remapping.

Thank you. I will revise the code.

>
> Alternatively, it might make sense to merge this pass with
> btf_dedup_compact_types() in order to minimize number of operations,
> e.g. as in my crude attempt:
> https://github.com/eddyz87/bpf/tree/binsort-btf-dedup

Thank you. I would refer to your patch.

> (fails with similar selftests issue).

In addition to the bug in my patch, I have also identified a bug in
linker_fixup_btf
in the libbpf. After resolving the issue, the selftests successfully
passed, and I will
create a new patch to address the bug.

>
> > + case BTF_KIND_PTR:
> > + case BTF_KIND_CONST:
> > + case BTF_KIND_VOLATILE:
> > + case BTF_KIND_RESTRICT:
> > + case BTF_KIND_TYPEDEF:
> > + case BTF_KIND_TYPE_TAG:
> > + case BTF_KIND_FUNC:
> > + case BTF_KIND_VAR:
> > + case BTF_KIND_DECL_TAG:
> > + bt->type = btf_get_mapped_type(btf, maps, bt->type);
> > + break;
>
> [...]