Re: perf test BPF failing on f24: fix

From: Arnaldo Carvalho de Melo
Date: Fri Aug 05 2016 - 10:35:42 EST


Em Fri, Aug 05, 2016 at 06:45:50PM +0900, Masami Hiramatsu escreveu:
> On Thu, 4 Aug 2016 18:47:24 -0300
> Arnaldo Carvalho de Melo <acme@xxxxxxxxxx> wrote:
>
> > Em Thu, Aug 04, 2016 at 04:36:56PM -0300, Arnaldo Carvalho de Melo escreveu:
> > > So:
> > >
> > > int err = debuginfo__get_text_offset(dbg, &baseaddr);
> > >
> > > is returning 0, no relocation, its dwarf_addrdie() that is not finding
> > > SyS_epoll_wait from its address.
> > >
> > > Trying to figure out why dwarf_addrdie(0xffffffffbd295b50) fails...
> >
> > So, trying to use that vmlinux with objdump to do disassembly I found
> > that I need to do some offsetting, and after calculating it, this made
> > things works for me:
> >
> > diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c
> > index f2d9ff064e2d..9b95754f28ed 100644
> > --- a/tools/perf/util/probe-finder.c
> > +++ b/tools/perf/util/probe-finder.c
> > @@ -1486,6 +1486,8 @@ retry:
> > /* Find cu die */
> > if (!dwarf_addrdie(dbg->dbg, (Dwarf_Addr)addr, &cudie)) {
> > if (!reloc && debuginfo__get_text_offset(dbg, &baseaddr) == 0) {
> > + if (baseaddr == 0)
> > + baseaddr = -0x3c000000;
>
> Hmm, strange... what's this magic number ...?
> Actually debuginfo__get_text_offset() is for kernel modules and it relocate the offset.

I see no mention to "kernel modules" in this function, and it is there
that the message is being emitted :-\

> But I guess sys_epoll_wait is in the kernel.

sure

> > addr += baseaddr;
> > reloc = true;
> > goto retry;
> >
> > ----------------
> >
> > With it, which is not a proper fix, of course, we get:
> >
> > [root@jouet ~]# perf probe sys_epoll_wait
> > Added new events:
> > probe:sys_epoll_wait (on sys_epoll_wait)
> > probe:sys_epoll_wait_1 (on sys_epoll_wait)
> > probe:sys_epoll_wait_2 (on sys_epoll_wait)
> >
> > You can now use it in all perf tools, such as:
> >
> > perf record -e probe:sys_epoll_wait_2 -aR sleep 1
> >
> > [root@jouet ~]#
> >
> > Which, using -v shows it "finding" the alias:
> >
> > [root@jouet ~]# perf probe -v sys_epoll_wait
> > probe-definition(0): sys_epoll_wait
> > symbol:sys_epoll_wait file:(null) line:0 offset:0 return:0 lazy:(null)
> > 0 arguments
> > Looking at the vmlinux_path (8 entries long)
> > Using /lib/modules/4.7.0+/build/vmlinux for symbols
> > Open Debuginfo file: /lib/modules/4.7.0+/build/vmlinux
> > Try to find probe point from debuginfo.
> > Symbol sys_epoll_wait address found : ffffffffbd295b50
> > Matched function: SyS_epoll_wait
> > found inline addr: 0xffffffff81295ee7
> > Probe point found: compat_SyS_epoll_pwait+151
> > found inline addr: 0xffffffff81295cca
> > Probe point found: SyS_epoll_pwait+138
> > found inline addr: 0xffffffff81295b50
> > Probe point found: SyS_epoll_wait+0
>
> Mine shows
> ----
> Using /usr/lib/debug/lib/modules/4.6.4-301.fc24.x86_64/vmlinux for symbols
> Open Debuginfo file: /usr/lib/debug/lib/modules/4.6.4-301.fc24.x86_64/vmlinux
> Try to find probe point from debuginfo.
> Symbol sys_epoll_wait address found : ffffffff81292d60 <-(*)
> Matched function: SyS_epoll_wait
> found inline addr: 0xffffffff812930f3
> Probe point found: compat_SyS_epoll_pwait+147
> found inline addr: 0xffffffff81292ed6
> Probe point found: SyS_epoll_pwait+134
> found inline addr: 0xffffffff81292d60
> Probe point found: SyS_epoll_wait+0
> Found 3 probe_trace_events.
> -----
>
> See the symbol address calcurated from symbol map, in successful case
> the address exactly same address of SyS_epoll_wait. This indicates
> something might wrong in the symbol map. (maybe KASLR?)
> Could you check what happen if nokaslr is passed to your kernel?

[acme@jouet linux]$ grep RANDOMIZE_BASE ../build/v4.7.0+/.config
CONFIG_RANDOMIZE_BASE=y

Yes, it is in place, I'll check rebootint with nokaslr.

Can you think about how to properly warn the user about this
possibility, if it is confirmed to cause this issue?

I wonder why this doesn't seem to affect annotation, that also uses vmlinux,
for instance:

[root@jouet ~]# perf record --all-kernel -F 10000 ping -f -c 100000 localhost
PING localhost.localdomain (127.0.0.1) 56(84) bytes of data.

--- localhost.localdomain ping statistics ---
100000 packets transmitted, 100000 received, 0% packet loss, time 732ms
rtt min/avg/max/mdev = 0.003/0.003/0.048/0.002 ms, ipg/ewma 0.007/0.003 ms
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.298 MB perf.data (7340 samples) ]
[root@jouet ~]# perf report -v | grep '\[' | head -10
Using /lib/modules/4.7.0+/build/vmlinux for symbols
20.42% ping /lib/modules/4.7.0+/build/vmlinux 0x960374 v [k] ipt_do_table
2.55% ping /lib/modules/4.7.0+/build/vmlinux 0x900b8a v [k] nf_iterate
1.94% ping /lib/modules/4.7.0+/build/vmlinux 0x95190b v [k] fib_table_lookup
1.68% ping /lib/modules/4.7.0+/build/vmlinux 0x2accfa v [k] __local_bh_enable_ip
1.68% ping /lib/modules/4.7.0+/build/vmlinux 0x9e41ab v [k] _raw_spin_lock_irqsave
1.66% ping /lib/modules/4.7.0+/build/vmlinux 0x55fd94 v [k] avc_has_perm
1.65% ping /lib/modules/4.7.0+/build/vmlinux 0x90e3e2 v [k] __ip_append_data.isra.44
1.31% ping /lib/modules/4.7.0+/build/vmlinux 0x5ead59 v [k] copy_user_enhanced_fast_string
1.21% ping /lib/modules/4.7.0+/build/vmlinux 0x938820 v [k] raw_local_deliver
1.14% ping /lib/modules/4.7.0+/build/vmlinux 0x6c6cea v [k] n_tty_write
[root@jouet ~]#

[root@jouet ~]# perf annotate -v --stdio ipt_do_table
Looking at the vmlinux_path (8 entries long)
Using /lib/modules/4.7.0+/build/vmlinux for symbols
symbol__disassemble: filename=/lib/modules/4.7.0+/build/vmlinux, sym=ipt_do_table, start=0xffffffffbd7601c0, end=0xffffffffbd7607b0
annotating [0x3cef9b0] /lib/modules/4.7.0+/build/vmlinux : [0x4159750] ipt_do_table
Executing: objdump --start-address=0xffffffff817601c0 --stop-address=0xffffffff817607b0 -l -d --no-show-raw -S -C /lib/modules/4.7.0+/build/vmlinux 2>/dev/nu
Percent | Source code & Disassembly of vmlinux for cycles:ppp (1498 samples)
----------------------------------------------------------------------------------
<SNIP>

[root@jouet ~]# grep ffffffffbd7601c0 /proc/kallsyms
ffffffffbd7601c0 T ipt_do_table
[root@jouet ~]#

Anyway, will reboot this machine with nokaslr to check that possibility.

- Arnaldo

> Thank you,
>
> > Found 3 probe_trace_events.
> > Opening /sys/kernel/debug/tracing//kprobe_events write=1
> > Writing event: p:probe/sys_epoll_wait _text+2711271
> > Writing event: p:probe/sys_epoll_wait_1 _text+2710730
> > Writing event: p:probe/sys_epoll_wait_2 _text+2710352
> > Added new events:
> > probe:sys_epoll_wait (on sys_epoll_wait)
> > probe:sys_epoll_wait_1 (on sys_epoll_wait)
> > probe:sys_epoll_wait_2 (on sys_epoll_wait)
> >
> > You can now use it in all perf tools, such as:
> >
> > perf record -e probe:sys_epoll_wait_2 -aR sleep 1
> >
> > [root@jouet ~]#
> >
> > ----------
> >
> > Now to figure out why baseaddr is returning as zero...
> >
> > - Arnaldo
>
>
> --
> Masami Hiramatsu <mhiramat@xxxxxxxxxx>