Re: [PATCHv2 bpf 3/3] bpf: Force cookies array to follow symbols sorting

From: Jiri Olsa
Date: Wed Jun 08 2022 - 06:13:19 EST


On Wed, Jun 08, 2022 at 09:59:41AM +0200, Jiri Olsa wrote:
> On Tue, Jun 07, 2022 at 09:10:32PM -0700, Alexei Starovoitov wrote:
> > On Tue, Jun 7, 2022 at 12:56 PM Jiri Olsa <olsajiri@xxxxxxxxx> wrote:
> > >
> > > On Tue, Jun 07, 2022 at 11:40:47AM -0700, Alexei Starovoitov wrote:
> > > > On Mon, Jun 6, 2022 at 11:48 AM Jiri Olsa <jolsa@xxxxxxxxxx> wrote:
> > > > >
> > > > > When user specifies symbols and cookies for kprobe_multi link
> > > > > interface it's very likely the cookies will be misplaced and
> > > > > returned to wrong functions (via get_attach_cookie helper).
> > > > >
> > > > > The reason is that to resolve the provided functions we sort
> > > > > them before passing them to ftrace_lookup_symbols, but we do
> > > > > not do the same sort on the cookie values.
> > > > >
> > > > > Fixing this by using sort_r function with custom swap callback
> > > > > that swaps cookie values as well.
> > > > >
> > > > > Fixes: 0236fec57a15 ("bpf: Resolve symbols with ftrace_lookup_symbols for kprobe multi link")
> > > > > Signed-off-by: Jiri Olsa <jolsa@xxxxxxxxxx>
> > > >
> > > > It looks good, but something in this patch is causing a regression:
> > > > ./test_progs -t kprobe_multi
> > > > test_kprobe_multi_test:PASS:load_kallsyms 0 nsec
> > > > #80/1 kprobe_multi_test/skel_api:OK
> > > > #80/2 kprobe_multi_test/link_api_addrs:OK
> > > > #80/3 kprobe_multi_test/link_api_syms:OK
> > > > #80/4 kprobe_multi_test/attach_api_pattern:OK
> > > > #80/5 kprobe_multi_test/attach_api_addrs:OK
> > > > #80/6 kprobe_multi_test/attach_api_syms:OK
> > > > #80/7 kprobe_multi_test/attach_api_fails:OK
> > > > test_bench_attach:PASS:get_syms 0 nsec
> > > > test_bench_attach:PASS:kprobe_multi_empty__open_and_load 0 nsec
> > > > libbpf: prog 'test_kprobe_empty': failed to attach: No such process
> > > > test_bench_attach:FAIL:bpf_program__attach_kprobe_multi_opts
> > > > unexpected error: -3
> > > > #80/8 kprobe_multi_test/bench_attach:FAIL
> > > > #80 kprobe_multi_test:FAIL
> > >
> > > looks like kallsyms search failed to find some symbol,
> > > but I can't reproduce with:
> > >
> > > ./vmtest.sh -- ./test_progs -t kprobe_multi
> > >
> > > can you share .config you used?
> >
> > I don't think it's config related.
> > Patch 2 is doing:
> >
> > - if (!bsearch(&name, args->syms, args->cnt, sizeof(*args->syms), symbols_cmp))
> > + sym = bsearch(&name, args->syms, args->cnt, sizeof(*args->syms), symbols_cmp);
> > + if (!sym)
> > + return 0;
> > +
> > + idx = sym - args->syms;
> > + if (args->addrs[idx])
> > return 0;
> >
> > addr = ftrace_location(addr);
> > if (!addr)
> > return 0;
> >
> > - args->addrs[args->found++] = addr;
> > + args->addrs[idx] = addr;
> > + args->found++;
> > return args->found == args->cnt ? 1 : 0;
> >
> > There are plenty of functions with the same name
> > in available_filter_functions.
> > So
> > if (args->addrs[idx])
> > return 0;
> > triggers for a lot of them.
> > At the end args->found != args->cnt.
>
> there's code in get_syms (prog_tests/kprobe_multi_test.c)
> that filters out duplicates
>
> >
> > Here is trivial debug patch:
> > diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> > index 601ccf1b2f09..c567cf56cb57 100644
> > --- a/kernel/trace/ftrace.c
> > +++ b/kernel/trace/ftrace.c
> > @@ -8037,8 +8037,10 @@ static int kallsyms_callback(void *data, const
> > char *name,
> > return 0;
> >
> > idx = sym - args->syms;
> > - if (args->addrs[idx])
> > + if (args->addrs[idx]) {
> > + printk("idx %x name %s\n", idx, name);
> > return 0;
> > + }
> >
> > addr = ftrace_location(addr);
> > if (!addr)
> > @@ -8078,6 +8080,7 @@ int ftrace_lookup_symbols(const char
> > **sorted_syms, size_t cnt, unsigned long *a
> > err = kallsyms_on_each_symbol(kallsyms_callback, &args);
> > if (err < 0)
> > return err;
> > + printk("found %zd cnt %zd\n", args.found, args.cnt);
> > return args.found == args.cnt ? 0 : -ESRCH;
> > }
> >
> > [ 13.096160] idx a500 name unregister_vclock
> > [ 13.096930] idx 82fb name pt_regs_offset
> > [ 13.106969] idx 92be name set_root
> > [ 13.107290] idx 4414 name event_function
> > [ 13.112570] idx 7d1d name phy_init
> > [ 13.114459] idx 7d13 name phy_exit
> > [ 13.114777] idx ab91 name watchdog
> > [ 13.115730] found 46921 cnt 47036
> >
> > I don't understand how it works for you at all.
> > It passes in BPF CI only because we don't run
> > kprobe_multi_test/bench_attach there (yet).
>
> reproduced after I updated the tree today.. not sure why I did
> not see that before, going to check

ok, I'm not completely crazy ;-) it's the weak functions fix:

b39181f7c690 ftrace: Add FTRACE_MCOUNT_MAX_OFFSET to avoid adding weak function

I should have noticed this before (from changelog):

A worker thread is added at boot up to scan all the ftrace record entries,
and will mark any that fail the FTRACE_MCOUNT_MAX_OFFSET test as disabled.
They will still appear in the available_filter_functions file as:

__ftrace_invalid_address___<invalid-offset>

(showing the offset that caused it to be invalid).


Steven,
is there a reason to show '__ftrace_invalid_address___*' symbols in
available_filter_functions? it seems more like debug message to me

I can easily filter them out, but my assumption was that any symbol
in available_filter_functions could be resolved in /proc/kalsyms

with the workaround patch below the bench test is passing for me

thanks,
jirka


---
diff --git a/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c b/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c
index 586dc52d6fb9..88086ac23ef3 100644
--- a/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c
+++ b/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c
@@ -364,6 +364,8 @@ static int get_syms(char ***symsp, size_t *cntp)
continue;
if (!strncmp(name, "rcu_", 4))
continue;
+ if (!strncmp(name, "__ftrace_invalid_address__", sizeof("__ftrace_invalid_address__") - 1))
+ continue;
err = hashmap__add(map, name, NULL);
if (err) {
free(name);