Re: [PATCH 04/31] perf record, bpf: Create probe points for BPF programs

From: Wangnan (F)
Date: Tue Oct 20 2015 - 23:33:47 EST




On 2015/10/21 3:12, Arnaldo Carvalho de Melo wrote:
On Wed, Oct 14, 2015 at 12:41:15PM +0000, Wang Nan wrote:
This patch introduces bpf__{un,}probe() functions to enable callers to
create kprobe points based on the section names of a BPF program. It parses
the section names in the program and creates corresponding 'struct
perf_probe_event' structures. The parse_perf_probe_command() function is
used to do the main parsing work. The resulting 'struct perf_probe_event'
is stored in the program's private data for later use.

By utilizing the new probing API, this patch creates probe points during
event parsing.

To ensure probe points are removed correctly, register an atexit hook so
that even if perf quits through exit(), bpf__clear() is still called and the
probe points are cleared. Note that bpf__clear() should be registered before
bpf__probe() is called, so that a failure of bpf__probe() can still trigger
bpf__clear() to remove the probe points that were already created.
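
The ordering described above can be sketched roughly as follows. This is
only an illustration, not the code from the patch: demo_probe(),
demo_clear() and demo_setup() are hypothetical stand-ins for bpf__probe(),
bpf__clear() and their caller, and the "probe points" are just a counter.

```c
#include <stdlib.h>

/* Hypothetical stand-ins; none of this is the actual perf code. */
static int nr_probed;	/* number of probe points currently installed */

static void demo_clear(void)
{
	/* Remove whatever was installed, even after a partial failure. */
	nr_probed = 0;
}

static int demo_probe(int nr_wanted, int fail_at)
{
	for (int i = 0; i < nr_wanted; i++) {
		if (i == fail_at)
			return -1;	/* earlier probes stay installed */
		nr_probed++;
	}
	return 0;
}

static int demo_setup(int nr_wanted, int fail_at)
{
	static int hook_registered;

	/* Register the cleanup first so a later failure is still covered. */
	if (!hook_registered) {
		if (atexit(demo_clear))
			return -1;
		hook_registered = 1;
	}
	return demo_probe(nr_wanted, fail_at);
}
```

With this ordering, a demo_probe() that fails halfway leaves some probe
points behind, and the already registered hook removes them when the
process exits.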

A strerror-style error reporting scaffold is created by this patch.
bpf__strerror_probe() is the first error reporting function in bpf-loader.c.

So, this one, for a non-root user, gives me:

[acme@felicio linux]$ perf record --event /tmp/foo.o sleep 1
event syntax error: '/tmp/foo.o'
\___ Invalid argument

(add -v to see detail)
Run 'perf list' for a list of valid events

Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]

-e, --event <event> event selector. use 'perf list' to list available events
[acme@felicio linux]$

--------------------

I.e. no libbpf error (good!), but then just an -EINVAL reported as an "event
syntax error", which clearly isn't a syntax error. We need to tell the user
that he or she needs special permissions for using sys_bpf() :-)

As root:

[root@felicio ~]# perf record --event /tmp/foo.o sleep
event syntax error: '/tmp/foo.o'
\___ Invalid argument

(add -v to see detail)
Run 'perf list' for a list of valid events

Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]

-e, --event <event> event selector. use 'perf list' to list available events
[root@felicio ~]# ls -la /tmp/foo.o
-rw-rw-r--. 1 acme acme 824 Oct 20 12:35 /tmp/foo.o
[root@felicio ~]# file /tmp/foo.o
/tmp/foo.o: ELF 64-bit LSB relocatable, no machine, version 1 (SYSV), not stripped


Hmm, it's something else. This is an ancient kernel, 4.2.0, probably without
eBPF support? Nope, it's there:

[root@felicio ~]# grep -i sys_bpf /proc/kallsyms
ffffffff811829d0 T SyS_bpf
ffffffff811829d0 T sys_bpf
[root@felicio ~]#

It's something else; we need to improve this error reporting:

[root@felicio ~]# perf record -v --event /tmp/foo.o sleep 1
libbpf: loading /tmp/foo.o
libbpf: section .strtab, size 60, link 0, flags 0, type=3
libbpf: section .text, size 0, link 0, flags 6, type=1
libbpf: section .data, size 0, link 0, flags 3, type=1
libbpf: section .bss, size 0, link 0, flags 3, type=8
libbpf: section do_fork, size 16, link 0, flags 6, type=1
libbpf: found program do_fork
libbpf: section license, size 4, link 0, flags 3, type=1
libbpf: license of /tmp/foo.o is GPL
libbpf: section version, size 4, link 0, flags 3, type=1
libbpf: kernel version of /tmp/foo.o is 40100
libbpf: section .symtab, size 96, link 1, flags 0, type=2
bpf: config program 'do_fork'
symbol:do_fork file:(null) line:0 offset:0 return:0 lazy:(null)
bpf: 'do_fork': event name is missing

BPF reports the problem, but it is a little hard to understand...

event syntax error: '/tmp/foo.o'
\___ Invalid argument

(add -v to see detail)
Run 'perf list' for a list of valid events

Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]

-e, --event <event> event selector. use 'perf list' to list available events
[root@felicio ~]#

[root@felicio ~]# grep do_fork /proc/kallsyms
ffffffff81099ab0 T _do_fork
ffffffff81ccc800 d do_fork_test
[root@felicio ~]#

$ echo '__attribute__((section("_do_fork"), used)) int fork(void *ctx) {return 0;} char _license[] __attribute__((section("license"), used)) = "GPL";int _version __attribute__((section("version"), used)) = 0x40100;' | clang -D__KERNEL__ $CLANG_OPTIONS $KERNEL_INC_OPTIONS -Wno-unused-value -Wno-pointer-sign -working-directory $WORKING_DIR -c - -target bpf -O2 -o /tmp/foo.o

In your program you only provide "do_fork", but we need "key=value" syntax.
"key" will become the name of the created kprobe. Please try
"__attribute__((section("func=do_fork"), used))" instead.
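
To illustrate the "key=value" rule, here is a simplified sketch of how a
section name could be split into an event name and a probe target. It is
not what perf does internally (the real work goes through
parse_perf_probe_command()); demo_split_section() is a hypothetical helper
written only for this example.

```c
#include <errno.h>
#include <string.h>

/*
 * Hypothetical helper, not the perf implementation: split a section name
 * of the form "key=value" into an event name and a probe target.
 */
static int demo_split_section(const char *sec, char *event, size_t event_sz,
			      const char **target)
{
	const char *eq = strchr(sec, '=');
	size_t len;

	if (!eq || eq == sec)
		return -EINVAL;		/* e.g. "do_fork": no event name */

	len = (size_t)(eq - sec);
	if (len >= event_sz)
		return -ENAMETOOLONG;
	if (eq[1] == '\0')
		return -EINVAL;		/* nothing to probe */

	memcpy(event, sec, len);
	event[len] = '\0';
	*target = eq + 1;		/* points into the caller's string */
	return 0;
}
```

Under these assumptions "func=do_fork" yields the event name "func" and
the target "do_fork", while a bare "do_fork" is rejected with -EINVAL.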

I think that when the event name is missing we had better construct a name
for it, as perf probe does, but then we need to deal with the perf probe
code again. It may require another patch.

For this patch, I think we can assign a new errno so that bpf__strerror_probe()
can give more information and let the user know whether the problem resides in
the BPF program or in the perf configuration. Do you think ENOEXEC is a good choice?
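
As a rough sketch of the idea, a strerror-style function could map such an
errno to a more specific message. The function name
demo_strerror_probe(), the message texts and the ENOEXEC mapping below are
assumptions for illustration only, not the code in bpf-loader.c.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical sketch; messages and the ENOEXEC mapping are assumptions. */
static int demo_strerror_probe(int err, char *buf, size_t size)
{
	const char *msg;

	if (!buf || !size)
		return -EINVAL;

	err = err > 0 ? err : -err;	/* accept both err and -err */
	switch (err) {
	case ENOEXEC:
		msg = "BPF program is malformed (bad section name?)";
		break;
	case EACCES:
	case EPERM:
		msg = "Insufficient permission to create probe points";
		break;
	default:
		msg = strerror(err);	/* fall back to the generic text */
		break;
	}
	snprintf(buf, size, "%s", msg);
	return 0;
}
```

A caller would pass the error returned by the probing step together with a
buffer, and print the buffer in place of the generic "Invalid argument".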

Thank you.


[root@felicio ~]# perf record -v --event /tmp/foo.o sleep 1
libbpf: loading /tmp/foo.o
libbpf: section .strtab, size 61, link 0, flags 0, type=3
libbpf: section .text, size 0, link 0, flags 6, type=1
libbpf: section .data, size 0, link 0, flags 3, type=1
libbpf: section .bss, size 0, link 0, flags 3, type=8
libbpf: section _do_fork, size 16, link 0, flags 6, type=1
libbpf: found program _do_fork
libbpf: section license, size 4, link 0, flags 3, type=1
libbpf: license of /tmp/foo.o is GPL
libbpf: section version, size 4, link 0, flags 3, type=1
libbpf: kernel version of /tmp/foo.o is 40100
libbpf: section .symtab, size 96, link 1, flags 0, type=2
bpf: config program '_do_fork'
symbol:_do_fork file:(null) line:0 offset:0 return:0 lazy:(null)
bpf: '_do_fork': event name is missing
event syntax error: '/tmp/foo.o'
\___ Invalid argument

(add -v to see detail)
Run 'perf list' for a list of valid events

Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]

-e, --event <event> event selector. use 'perf list' to list available events
[root@felicio ~]#

So it still doesn't work, and it doesn't look like it is trying to find a
vmlinux. I will look at another patch that, IIRC, is in this patchkit and
allows us to tell 'perf record' where to find it... But it could just as
well use kallsyms...

- Arnaldo

