Re: [RFC bpf-next 4/4] selftests/bpf: Add attach bench test

From: Alexei Starovoitov
Date: Thu Apr 28 2022 - 15:00:14 EST


On Thu, Apr 28, 2022 at 6:58 AM Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
>
> On Sat, 16 Apr 2022 23:21:03 +0900
> Masami Hiramatsu <mhiramat@xxxxxxxxxx> wrote:
>
> > OK, I also confirmed that __bpf_tramp_exit is listed. (The others don't seem to be notrace.)
> >
> > /sys/kernel/tracing # cat available_filter_functions | grep __bpf_tramp
> > __bpf_tramp_image_release
> > __bpf_tramp_image_put_rcu
> > __bpf_tramp_image_put_rcu_tasks
> > __bpf_tramp_image_put_deferred
> > __bpf_tramp_exit
> >
> > My gcc is an older one.
> > gcc version 9.4.0 (Ubuntu 9.4.0-1ubuntu1~20.04.1)
> >
> > But it seems that __bpf_tramp_exit() doesn't call __fentry__. (I objdump'ed)
> >
> > ffffffff81208270 <__bpf_tramp_exit>:
> > ffffffff81208270: 55 push %rbp
> > ffffffff81208271: 48 89 e5 mov %rsp,%rbp
> > ffffffff81208274: 53 push %rbx
> > ffffffff81208275: 48 89 fb mov %rdi,%rbx
> > ffffffff81208278: e8 83 70 ef ff callq ffffffff810ff300 <__rcu_read_lock>
> > ffffffff8120827d: 31 d2 xor %edx,%edx
>
> You need to look deeper ;-)
> >
> >
> > >
> > > So it's quite bizarre and inconsistent.
> >
> > Indeed. I guess there is a bug in scripts/recordmcount.pl.
>
> No there isn't.
>
> I added the addresses it was mapping and found this:
>
> ffffffffa828f680 T __bpf_tramp_exit
>
> (which is relocated, but it's trivial to map it to the actual function).
>
> At the end of that function we have:
>
> ffffffff8128f767: 48 8d bb e0 00 00 00 lea 0xe0(%rbx),%rdi
> ffffffff8128f76e: 48 8b 40 08 mov 0x8(%rax),%rax
> ffffffff8128f772: e8 89 28 d7 00 call ffffffff82002000 <__x86_indirect_thunk_array>
> ffffffff8128f773: R_X86_64_PLT32 __x86_indirect_thunk_rax-0x4
> ffffffff8128f777: e9 4a ff ff ff jmp ffffffff8128f6c6 <__bpf_tramp_exit+0x46>
> ffffffff8128f77c: 0f 1f 40 00 nopl 0x0(%rax)
> ffffffff8128f780: e8 8b df dc ff call ffffffff8105d710 <__fentry__>
> ffffffff8128f781: R_X86_64_PLT32 __fentry__-0x4
> ffffffff8128f785: b8 f4 fd ff ff mov $0xfffffdf4,%eax
> ffffffff8128f78a: c3 ret
> ffffffff8128f78b: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
>
>
> Notice the call to fentry!
>
> It's due to this:
>
> void notrace __bpf_tramp_exit(struct bpf_tramp_image *tr)
> {
> percpu_ref_put(&tr->pcref);
> }
>
> int __weak
> arch_prepare_bpf_trampoline(struct bpf_tramp_image *tr, void *image, void *image_end,
> const struct btf_func_model *m, u32 flags,
> struct bpf_tramp_progs *tprogs,
> void *orig_call)
> {
> return -ENOTSUPP;
> }
>
> The weak function gets a call to __fentry__ and still gets compiled into
> vmlinux, but its symbol is dropped because it is overridden. Thus, the
> mcount_loc finds this call to fentry, and it gets mapped to the symbol
> before it, which just happens to be __bpf_tramp_exit.

Ouch. That _is_ a bug in recordmcount.

> I made that weak function "notrace" and the __bpf_tramp_exit disappeared
> from the available_filter_functions list.

That's a hack. We cannot rely on such hacks for all weak functions.
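
For reference, the workaround Steven describes amounts to marking the weak
stub notrace so the compiler never emits the __fentry__ call for it in the
first place. Roughly (a sketch against the stub quoted above, not the actual
change):

int __weak notrace
arch_prepare_bpf_trampoline(struct bpf_tramp_image *tr, void *image, void *image_end,
			    const struct btf_func_model *m, u32 flags,
			    struct bpf_tramp_progs *tprogs,
			    void *orig_call)
{
	/* weak stub: still compiled into vmlinux even when an arch overrides it */
	return -ENOTSUPP;
}

That avoids the stray mcount_loc entry for this one stub, but every
overridable weak function would need the same annotation.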