Re: perf test BPF failing on f24: fix

From: Arnaldo Carvalho de Melo
Date: Fri Aug 05 2016 - 11:55:17 EST


Em Fri, Aug 05, 2016 at 11:35:32AM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Fri, Aug 05, 2016 at 06:45:50PM +0900, Masami Hiramatsu escreveu:
> > See the symbol address calcurated from symbol map, in successful case
> > the address exactly same address of SyS_epoll_wait. This indicates
> > something might wrong in the symbol map. (maybe KASLR?)
> > Could you check what happen if nokaslr is passed to your kernel?

> [acme@jouet linux]$ grep RANDOMIZE_BASE ../build/v4.7.0+/.config
> CONFIG_RANDOMIZE_BASE=y

> Yes, it is in place, I'll check rebooting with nokaslr.

> Can you think about how to properly warn the user about this
> possibility, if it is confirmed to cause this issue?

> I wonder why this doesn't seem to affect annotation, that also uses vmlinux,
> for instance:

> [root@jouet ~]# perf record --all-kernel -F 10000 ping -f -c 100000 localhost
> PING localhost.localdomain (127.0.0.1) 56(84) bytes of data.

> --- localhost.localdomain ping statistics ---
> 100000 packets transmitted, 100000 received, 0% packet loss, time 732ms
> rtt min/avg/max/mdev = 0.003/0.003/0.048/0.002 ms, ipg/ewma 0.007/0.003 ms
> [ perf record: Woken up 2 times to write data ]
> [ perf record: Captured and wrote 0.298 MB perf.data (7340 samples) ]
> [root@jouet ~]# perf report -v | grep '\[' | head -10
> Using /lib/modules/4.7.0+/build/vmlinux for symbols
> 20.42% ping /lib/modules/4.7.0+/build/vmlinux 0x960374 v [k] ipt_do_table
> 2.55% ping /lib/modules/4.7.0+/build/vmlinux 0x900b8a v [k] nf_iterate
> 1.94% ping /lib/modules/4.7.0+/build/vmlinux 0x95190b v [k] fib_table_lookup
> 1.68% ping /lib/modules/4.7.0+/build/vmlinux 0x2accfa v [k] __local_bh_enable_ip
> 1.68% ping /lib/modules/4.7.0+/build/vmlinux 0x9e41ab v [k] _raw_spin_lock_irqsave
> 1.66% ping /lib/modules/4.7.0+/build/vmlinux 0x55fd94 v [k] avc_has_perm
> 1.65% ping /lib/modules/4.7.0+/build/vmlinux 0x90e3e2 v [k] __ip_append_data.isra.44
> 1.31% ping /lib/modules/4.7.0+/build/vmlinux 0x5ead59 v [k] copy_user_enhanced_fast_string
> 1.21% ping /lib/modules/4.7.0+/build/vmlinux 0x938820 v [k] raw_local_deliver
> 1.14% ping /lib/modules/4.7.0+/build/vmlinux 0x6c6cea v [k] n_tty_write
> [root@jouet ~]#

> [root@jouet ~]# perf annotate -v --stdio ipt_do_table
> Looking at the vmlinux_path (8 entries long)
> Using /lib/modules/4.7.0+/build/vmlinux for symbols
> symbol__disassemble: filename=/lib/modules/4.7.0+/build/vmlinux, sym=ipt_do_table, start=0xffffffffbd7601c0, end=0xffffffffbd7607b0
> annotating [0x3cef9b0] /lib/modules/4.7.0+/build/vmlinux : [0x4159750] ipt_do_table
> Executing: objdump --start-address=0xffffffff817601c0 --stop-address=0xffffffff817607b0 -l -d --no-show-raw -S -C /lib/modules/4.7.0+/build/vmlinux 2>/dev/nu
> Percent | Source code & Disassembly of vmlinux for cycles:ppp (1498 samples)
> ----------------------------------------------------------------------------------
> <SNIP>

> [root@jouet ~]# grep ffffffffbd7601c0 /proc/kallsyms
> ffffffffbd7601c0 T ipt_do_table
> [root@jouet ~]#

> Anyway, will reboot this machine with nokaslr to check that possibility.

So, more info, but executive summary: distro kernels don't exhibit this
problem, but they have CONFIG_RANDOMIZE_BASE=y, the kernel where I
noticed this problem (4.7.0+) doesn't exhibit this problem when used
with 'nokaslr', but I think more investigation is needed to get to the
bottom of this, i.e. to emit meaningful messages when kaslr is in place
(or is suspected to be):

Using a distro kernel:

[root@jouet ~]# uname -r
4.6.4-301.fc24.x86_64
[root@jouet ~]# perf probe sys_epoll_wait
Added new event:
probe:sys_epoll_wait (on sys_epoll_wait)

You can now use it in all perf tools, such as:

perf record -e probe:sys_epoll_wait -aR sleep 1

[root@jouet ~]# rpm -q kernel-debuginfo
kernel-debuginfo-4.6.3-300.fc24.x86_64
[root@jouet ~]# uname -r
4.6.4-301.fc24.x86_64
[root@jouet ~]#

So it must be using just /proc/kallsyms, since there is no matching vmlinux:

[root@jouet ~]# perf probe -d *:*
Removed event: probe:sys_epoll_wait
[root@jouet ~]# perf probe -v sys_epoll_wait
probe-definition(0): sys_epoll_wait
symbol:sys_epoll_wait file:(null) line:0 offset:0 return:0 lazy:(null)
0 arguments
Looking at the vmlinux_path (8 entries long)
symsrc__init: cannot get elf header.
Could not open debuginfo. Try to use symbols.
Looking at the vmlinux_path (8 entries long)
symsrc__init: cannot get elf header.
Using /proc/kcore for kernel object code
Using /proc/kallsyms for symbols
Opening /sys/kernel/debug/tracing//kprobe_events write=1
Writing event: p:probe/sys_epoll_wait _text+2698592
Added new event:
probe:sys_epoll_wait (on sys_epoll_wait)

You can now use it in all perf tools, such as:

perf record -e probe:sys_epoll_wait -aR sleep 1

[root@jouet ~]#

Ok, installing a matching kernel vmlinux:

[root@jouet ~]# perf probe -d *:*
Removed event: probe:sys_epoll_wait
[root@jouet ~]# dnf debuginfo-install kernel

Trying again:

[root@jouet ~]# perf probe -v sys_epoll_wait
probe-definition(0): sys_epoll_wait
symbol:sys_epoll_wait file:(null) line:0 offset:0 return:0 lazy:(null)
0 arguments
Looking at the vmlinux_path (8 entries long)
Using /usr/lib/debug/lib/modules/4.6.4-301.fc24.x86_64/vmlinux for symbols
Open Debuginfo file: /usr/lib/debug/lib/modules/4.6.4-301.fc24.x86_64/vmlinux
Try to find probe point from debuginfo.
Symbol sys_epoll_wait address found : ffffffff81292d60
Matched function: SyS_epoll_wait
found inline addr: 0xffffffff812930f3
Probe point found: compat_SyS_epoll_pwait+147
found inline addr: 0xffffffff81292ed6
Probe point found: SyS_epoll_pwait+134
found inline addr: 0xffffffff81292d60
Probe point found: SyS_epoll_wait+0
Found 3 probe_trace_events.
Opening /sys/kernel/debug/tracing//kprobe_events write=1
Writing event: p:probe/sys_epoll_wait _text+2699507
Writing event: p:probe/sys_epoll_wait_1 _text+2698966
Writing event: p:probe/sys_epoll_wait_2 _text+2698592
Added new events:
probe:sys_epoll_wait (on sys_epoll_wait)
probe:sys_epoll_wait_1 (on sys_epoll_wait)
probe:sys_epoll_wait_2 (on sys_epoll_wait)

You can now use it in all perf tools, such as:

perf record -e probe:sys_epoll_wait_2 -aR sleep 1

[root@jouet ~]#

So, a distro kernel works with or without vmlinux, but then...

[root@jouet ~]# grep RANDOMIZE_BASE /boot/config-4.6.4-301.fc24.x86_64
CONFIG_RANDOMIZE_BASE=y
CONFIG_RANDOMIZE_BASE_MAX_OFFSET=0x40000000
[root@jouet ~]#

It also has kaslr :-\ Rebooting with the kernel where the problem happened,
this time with nokaslr:

[root@jouet ~]# uname -r
4.7.0+
[root@jouet ~]# cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-4.7.0+ root=/dev/mapper/fedora-root ro rd.lvm.lv=fedora/root rd.lvm.lv=fedora/swap rhgb quiet LANG=en_US.UTF-8 nokaslr
[root@jouet ~]# perf probe -v sys_epoll_wait
probe-definition(0): sys_epoll_wait
symbol:sys_epoll_wait file:(null) line:0 offset:0 return:0 lazy:(null)
0 arguments
Looking at the vmlinux_path (8 entries long)
Using /lib/modules/4.7.0+/build/vmlinux for symbols
Open Debuginfo file: /lib/modules/4.7.0+/build/vmlinux
Try to find probe point from debuginfo.
Symbol sys_epoll_wait address found : ffffffff81295b50
Matched function: SyS_epoll_wait
found inline addr: 0xffffffff81295ee7
Probe point found: compat_SyS_epoll_pwait+151
found inline addr: 0xffffffff81295cca
Probe point found: SyS_epoll_pwait+138
found inline addr: 0xffffffff81295b50
Probe point found: SyS_epoll_wait+0
Found 3 probe_trace_events.
Opening /sys/kernel/debug/tracing//kprobe_events write=1
Writing event: p:probe/sys_epoll_wait _text+2711271
Writing event: p:probe/sys_epoll_wait_1 _text+2710730
Writing event: p:probe/sys_epoll_wait_2 _text+2710352
Added new events:
probe:sys_epoll_wait (on sys_epoll_wait)
probe:sys_epoll_wait_1 (on sys_epoll_wait)
probe:sys_epoll_wait_2 (on sys_epoll_wait)

You can now use it in all perf tools, such as:

perf record -e probe:sys_epoll_wait_2 -aR sleep 1

[root@jouet ~]#

It works, but why was annotation worked when "nokaslr" wasn't used?

- Arnaldo