Re: [RFC PATCH v3] bpf: Using binary search to improve the performance of btf_find_by_name_kind
From: Eduard Zingerman
Date: Mon Jun 17 2024 - 18:27:23 EST
On Mon, 2024-06-17 at 15:18 +0100, Alan Maguire wrote:
[...]
> If the plan is to fold the sorting into dedup, pahole will inherit it by
> default I suppose. Would it be worth making sorting optional (or at
> least providing a way to switch if off) via a dedup_opts option? If we
> had an on/off switch we could control sorting via a --btf_features
> option to pahole.
I'd avoid adding more flags if not absolutely necessary.
> One thing we lose with sorting is that currently the base and often-used
> types tend to cluster at initial BTF ids, so in some cases linear
> searches find what they're looking for pretty quickly. Would it be worth
> maintaining a name-sorted index for BTF perhaps? That would mean not
> changing type id order (so linear search is unaffected), but for
> btf_find_by_name_kind() searches the index could be used.
Instrumented kernel code as follows:
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -555,10 +555,13 @@ s32 btf_find_by_name_kind(const struct btf *btf, const char *name, u8 kind)
continue;
tname = btf_name_by_offset(btf, t->name_off);
- if (!strcmp(tname, name))
+ if (!strcmp(tname, name)) {
+ printk("btf_find_by_name_kind: kind=%d, name='%s', id=%d\n", kind, name, i);
return i;
+ }
}
+ printk("btf_find_by_name_kind: kind=%d, name='%s', id=-1\n", kind, name);
return -ENOENT;
}
And analyzed the log generated when test_progs are run.
The summary of results is as follows:
| # of lookups | kind | name | id |
|--------------+------+---------------------+--------|
| 3480 | 4 | bpf_refcount | -1 |
| 3439 | 4 | bpf_rb_root | -1 |
| 3434 | 4 | bpf_rb_node | -1 |
| 3340 | 4 | bpf_list_head | -1 |
| 3339 | 4 | bpf_list_node | -1 |
| 3165 | 4 | bpf_spin_lock | -1 |
| 759 | 4 | foo | 30 |
| 659 | 4 | bpf_cpumask | 65569 |
| 194 | 4 | prog_test_ref_kfunc | 29619 |
| 146 | 4 | bpf_spin_lock | 8 |
| 123 | 4 | bpf_list_node | 31 |
| 123 | 4 | bpf_list_head | 11 |
| 116 | 4 | bar | 38 |
| ... 59 rows, 10<N<100 lookups each ... |
| ... 227 rows, N<10 lookups each ... |
(24680 lookups in total)
I'd say that 'bpf_spin_lock', 'bpf_list_node' and 'bpf_list_head'
could be considered as types that would be found quickly by the linear
search. Their total share is ~1.6% of all lookups. I don't think we
should add a special case for such low number of "hot" cases.
Also, the share of -1 results is surprising.
[...]