'perf test BPF' failing, libbpf regression wrt "basic API for BPF obj name"

From: Arnaldo Carvalho de Melo
Date: Tue Nov 28 2017 - 14:05:28 EST


So, I had 'perf test BPF' failing in perf/core, at first thought it was
due to an update I had made to clang, but then I noticed that with
perf/urgent it works, so something else, did a bisection and got to:

[acme@jouet linux]$ git bisect bad
88cda1c9da02c8aa31e1d5dcf22e8a35cc8c19f2 is the first bad commit
commit 88cda1c9da02c8aa31e1d5dcf22e8a35cc8c19f2
Author: Martin KaFai Lau <kafai@xxxxxx>
Date: Wed Sep 27 14:37:54 2017 -0700

bpf: libbpf: Provide basic API support to specify BPF obj name

This patch extends the libbpf to provide API support to
allow specifying BPF object name.

In tools/lib/bpf/libbpf, the C symbol of the function
and the map is used. Regarding section name, all maps are
under the same section named "maps". Hence, section name
is not a good choice for map's name. To be consistent with
map, bpf_prog also follows and uses its function symbol as
the prog's name.

This patch adds logic to collect function's symbols in libbpf.
There is existing codes to collect the map's symbols and no change
is needed.

The bpf_load_program_name() and bpf_map_create_name() are
added to take the name argument. For the other bpf_map_create_xxx()
variants, a name argument is directly added to them.

In samples/bpf, bpf_load.c in particular, the symbol is also
used as the map's name and the map symbols has already been
collected in the existing code. For bpf_prog, bpf_load.c does
not collect the function symbol name. We can consider to collect
them later if there is a need to continue supporting the bpf_load.c.

Signed-off-by: Martin KaFai Lau <kafai@xxxxxx>
Acked-by: Alexei Starovoitov <ast@xxxxxx>
Acked-by: Daniel Borkmann <daniel@xxxxxxxxxxxxx>
Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx>

:040000 040000 9082e2747e39319fcd1555506bc24f36fc85ec43 433c27b777924b188028d79d2c5648a917135f39 M samples
:040000 040000 6113e47f92559d3b4ac4eda399d2c980ac407603 9774db0bb47eab0ffc642623bd7d7a6f46b820c6 M tools
[acme@jouet linux]$
[acme@jouet linux]$

Please CC me and Wang Nan when making changes to tools/lib/bpf/,
as it started having tools/perf/ as its sole user, i.e. there is code
there that uses it and we have to make sure it continues working :-\

What fails?

[root@jouet ~]# perf test bpf
39: BPF filter :
39.1: Basic BPF filtering : FAILED!
39.2: BPF pinning : Skip
39.3: BPF prologue generation : Skip
39.4: BPF relocation checker : Skip
[root@jouet ~]#

With -v we can see that it is map setup that fails, with the error being:

libbpf: failed to create map (name: 'flip_table'): Invalid argument
libbpf: failed to load object '[basic_bpf_test]'
bpf: load objects failed
Failed to add events selected by BPF

Full logs, see below for further info I had collected before doing the bisect:

[root@jouet ~]# perf test -v bpf
39: BPF filter :
39.1: Basic BPF filtering :
--- start ---
test child forked, pid 12198
Kernel build dir is set to /lib/modules/4.14.0+/build
set env: KBUILD_DIR=/lib/modules/4.14.0+/build
unset env: KBUILD_OPTS
include option is set to -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h
set env: NR_CPUS=4
set env: LINUX_VERSION_CODE=0x40e00
set env: CLANG_EXEC=/usr/local/bin/clang
set env: CLANG_OPTIONS=-xc
set env: KERNEL_INC_OPTIONS= -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h
set env: WORKING_DIR=/lib/modules/4.14.0+/build
set env: CLANG_SOURCE=-
llvm compiling command template: echo '/*
* bpf-script-example.c
* Test basic LLVM building
*/
#ifndef LINUX_VERSION_CODE
# error Need LINUX_VERSION_CODE
# error Example: for 4.2 kernel, put 'clang-opt="-DLINUX_VERSION_CODE=0x40200" into llvm section of ~/.perfconfig'
#endif
#define BPF_ANY 0
#define BPF_MAP_TYPE_ARRAY 2
#define BPF_FUNC_map_lookup_elem 1
#define BPF_FUNC_map_update_elem 2

static void *(*bpf_map_lookup_elem)(void *map, void *key) =
(void *) BPF_FUNC_map_lookup_elem;
static void *(*bpf_map_update_elem)(void *map, void *key, void *value, int flags) =
(void *) BPF_FUNC_map_update_elem;

struct bpf_map_def {
unsigned int type;
unsigned int key_size;
unsigned int value_size;
unsigned int max_entries;
};

#define SEC(NAME) __attribute__((section(NAME), used))
struct bpf_map_def SEC("maps") flip_table = {
.type = BPF_MAP_TYPE_ARRAY,
.key_size = sizeof(int),
.value_size = sizeof(int),
.max_entries = 1,
};

SEC("func=SyS_epoll_wait")
int bpf_func__SyS_epoll_wait(void *ctx)
{
int ind =0;
int *flag = bpf_map_lookup_elem(&flip_table, &ind);
int new_flag;
if (!flag)
return 0;
/* flip flag and store back */
new_flag = !*flag;
bpf_map_update_elem(&flip_table, &ind, &new_flag, BPF_ANY);
return new_flag;
}
char _license[] SEC("license") = "GPL";
int _version SEC("version") = LINUX_VERSION_CODE;
' | $CLANG_EXEC -D__KERNEL__ -D__NR_CPUS__=$NR_CPUS -DLINUX_VERSION_CODE=$LINUX_VERSION_CODE $CLANG_OPTIONS $KERNEL_INC_OPTIONS -Wno-unused-value -Wno-pointer-sign -working-directory $WORKING_DIR -c "$CLANG_SOURCE" -target bpf -O2 -o -
libbpf: loading object '[basic_bpf_test]' from buffer
libbpf: section .strtab, size 120, link 0, flags 0, type=3
libbpf: section .text, size 0, link 0, flags 6, type=1
libbpf: section func=SyS_epoll_wait, size 192, link 0, flags 6, type=1
libbpf: found program func=SyS_epoll_wait
libbpf: section .relfunc=SyS_epoll_wait, size 32, link 8, flags 0, type=9
libbpf: section maps, size 16, link 0, flags 3, type=1
libbpf: section license, size 4, link 0, flags 3, type=1
libbpf: license of [basic_bpf_test] is GPL
libbpf: section version, size 4, link 0, flags 3, type=1
libbpf: kernel version of [basic_bpf_test] is 40e00
libbpf: section .symtab, size 168, link 1, flags 0, type=2
libbpf: maps in [basic_bpf_test]: 1 maps in 16 bytes
libbpf: map 0 is "flip_table"
libbpf: collecting relocating info for: 'func=SyS_epoll_wait'
libbpf: relocation: insn_idx=4
libbpf: relocation: find map 0 (flip_table) for insn 4
libbpf: relocation: insn_idx=17
libbpf: relocation: find map 0 (flip_table) for insn 17
bpf: config program 'func=SyS_epoll_wait'
symbol:SyS_epoll_wait file:(null) line:0 offset:0 return:0 lazy:(null)
bpf: config 'func=SyS_epoll_wait' is ok
Looking at the vmlinux_path (8 entries long)
Using /lib/modules/4.14.0+/build/vmlinux for symbols
Open Debuginfo file: /lib/modules/4.14.0+/build/vmlinux
Try to find probe point from debuginfo.
Matched function: SyS_epoll_wait [2e5a9be]
found inline addr: 0xffffffff812af256
Probe point found: compat_SyS_epoll_pwait+150
found inline addr: 0xffffffff812af037
Probe point found: SyS_epoll_pwait+135
found inline addr: 0xffffffff812aeed0
Probe point found: SyS_epoll_wait+0
Found 3 probe_trace_events.
Opening /sys/kernel/debug/tracing//kprobe_events write=1
Parsing probe_events: r:probe/check_stack_object _text+2485296 ret=$retval
Group:probe Event:check_stack_object probe:r
Writing event: p:perf_bpf_probe/func _text+2814550
Writing event: p:perf_bpf_probe/func_1 _text+2814007
Writing event: p:perf_bpf_probe/func_2 _text+2813648
libbpf: failed to create map (name: 'flip_table'): Invalid argument
libbpf: failed to load object '[basic_bpf_test]'
bpf: load objects failed
Failed to add events selected by BPF
Opening /sys/kernel/debug/tracing//kprobe_events write=1
Opening /sys/kernel/debug/tracing//uprobe_events write=1
Parsing probe_events: p:perf_bpf_probe/func _text+2814550
Group:perf_bpf_probe Event:func probe:p
Parsing probe_events: p:perf_bpf_probe/func_1 _text+2814007
Group:perf_bpf_probe Event:func_1 probe:p
Parsing probe_events: p:perf_bpf_probe/func_2 _text+2813648
Group:perf_bpf_probe Event:func_2 probe:p
Parsing probe_events: r:probe/check_stack_object _text+2485296 ret=$retval
Group:probe Event:check_stack_object probe:r
Writing event: -:perf_bpf_probe/func
Opening /sys/kernel/debug/tracing//kprobe_events write=1
Opening /sys/kernel/debug/tracing//uprobe_events write=1
Parsing probe_events: p:perf_bpf_probe/func_1 _text+2814007
Group:perf_bpf_probe Event:func_1 probe:p
Parsing probe_events: p:perf_bpf_probe/func_2 _text+2813648
Group:perf_bpf_probe Event:func_2 probe:p
Parsing probe_events: r:probe/check_stack_object _text+2485296 ret=$retval
Group:probe Event:check_stack_object probe:r
Writing event: -:perf_bpf_probe/func_1
Opening /sys/kernel/debug/tracing//kprobe_events write=1
Opening /sys/kernel/debug/tracing//uprobe_events write=1
Parsing probe_events: p:perf_bpf_probe/func_2 _text+2813648
Group:perf_bpf_probe Event:func_2 probe:p
Parsing probe_events: r:probe/check_stack_object _text+2485296 ret=$retval
Group:probe Event:check_stack_object probe:r
Writing event: -:perf_bpf_probe/func_2
test child finished with -1
---- end ----
BPF filter subtest 0: FAILED!
39.2: BPF pinning :
--- force skipped ---
BPF filter subtest 1: Skip
39.3: BPF prologue generation :
--- force skipped ---
BPF filter subtest 2: Skip
39.4: BPF relocation checker :
--- force skipped ---
BPF filter subtest 3: Skip
[root@jouet ~]#

Em Mon, Nov 27, 2017 at 04:34:25PM -0300, Arnaldo Carvalho de Melo escreveu:
> So, I noticed that any maps are failing, I'll go dig, but may be some
> new security tightening, even running as root, this was working
> recently, was even part of our discussion on the bpf_probe_read_str()
> trouble with clang's optimizer:
>
> [root@jouet bpf]# cat open.c
> #include "bpf.h"
>
> SEC("prog=do_sys_open filename")
> int prog(void *ctx, int err, char *filename_ptr)
> {
> char filename[128];
> int len = bpf_probe_read_str(filename, sizeof(filename), filename_ptr);
> if (len > 0) {
> if (len == 1)
> perf_event_output(ctx, &__bpf_stdout__, BPF_F_CURRENT_CPU, filename, len);
> else if (len < 128)
> perf_event_output(ctx, &__bpf_stdout__, BPF_F_CURRENT_CPU, filename, len);
> }
> return 1;
> }
> [root@jouet bpf]#
> <SNIP>
> Found 1 probe_trace_events.
> Opening /sys/kernel/debug/tracing//kprobe_events write=1
> Writing event: p:perf_bpf_probe/prog _text+2493856 filename=%si:x64
> In map_prologue, ntevs=1
> mapping[0]=0
> libbpf: failed to create map (name: '__bpf_stdout__'): Invalid argument
> libbpf: failed to load object 'open.c'
> bpf: load objects failed
> event syntax error: 'open.c'
> \___ Operation not permitted
> <SNIP>
>
> Using 'perf ftrace' to trace just 'perf trace':
>
> [root@jouet bpf]# perf ftrace -G SyS_bpf perf trace -v -e open.c,open cat /tmp/somefile 2> /dev/null
> 0) | SyS_bpf() {
> 0) | capable() {
> 0) | ns_capable_common() {
> 0) | security_capable() {
> 0) 0.045 us | cap_capable();
> 0) | selinux_capable() {
> 0) 0.274 us | cred_has_capability();
> 0) 0.518 us | }
> 0) 1.464 us | }
> 0) 1.783 us | }
> 0) 2.130 us | }
> 0) 0.458 us | check_uarg_tail_zero();
> 0) | __check_object_size() {
> 0) 0.046 us | __virt_addr_valid();
> 0) 0.040 us | check_stack_object();
> 0) 0.510 us | }
> 0) 4.161 us | }
> [root@jouet bpf]#
>
> /me goes to look at SyS_bpf() in this kernel... (4.14.0+).

Tracing 'perf trace' with 'perf trace' we see:

# perf trace -e bpf perf trace -e open.c,open cat /tmp/somefile
<SNIP traced 'perf trace' error messages>
0.000 ( 0.015 ms): perf/16767 bpf(cmd: MAP_CREATE, uattr: 0x7ffc3c8c7ac0, size: 72) = -1 EINVAL Invalid argument
#

Humm,

# perf probe check_stack_object%return 'ret=$retval'
Added new event:
probe:check_stack_object (on check_stack_object%return with ret=$retval)

You can now use it in all perf tools, such as:

perf record -e probe:check_stack_object -aR sleep 1

#

# perf trace -e bpf,probe:check* perf trace -e open.c,open cat /tmp/somefile
<SNIP lots of check_stack_object calls returning 0x0)
4626.779 ( 0.004 ms): perf/31498 bpf(cmd: MAP_CREATE, uattr: 0x7fff7dbbaab0, size: 72 ) ...
4626.784 ( ): probe:check_stack_object:(ffffffffb625ec30 <- ffffffffb625ed1f) ret=0x2)
4626.779 ( 0.006 ms): perf/31498 ... [continued]: bpf()) = -1 EINVAL Invalid argument
<SNIP lots of check_stack_object calls returning 0x0)

check_stack_object returning 0x2 means GOOD_STACK, 0x0 means
NOT_STACK...

- Arnaldo