Re: [PATCH 04/31] perf record, bpf: Create probe points for BPF programs

From: Arnaldo Carvalho de Melo
Date: Wed Oct 21 2015 - 09:28:17 EST


On Wed, Oct 21, 2015 at 11:31:57AM +0800, Wangnan (F) wrote:
>
>
> On 2015/10/21 3:12, Arnaldo Carvalho de Melo wrote:
> >On Wed, Oct 14, 2015 at 12:41:15PM +0000, Wang Nan wrote:
> >>This patch introduces bpf__{un,}probe() functions to enable callers to
> >>create kprobe points based on the section names of a BPF program. It parses
> >>the section names in the program and creates corresponding 'struct
> >>perf_probe_event' structures. The parse_perf_probe_command() function is
> >>used to do the main parsing work. The resulting 'struct perf_probe_event'
> >>is stored in the program's private data for later use.
> >>
> >>By utilizing the new probing API, this patch creates probe points during
> >>event parsing.
> >>
> >>To ensure probe points are removed correctly, register an atexit hook
> >>so that bpf__clear() is still called even if perf quits through exit(),
> >>and the probe points are cleared. Note that bpf__clear() should be registered
> >>before bpf__probe() is called, so that a failure of bpf__probe() can still
> >>trigger bpf__clear() to remove probe points that have already been created.
> >>
> >>A strerror-style error reporting scaffold is created by this patch.
> >>bpf__strerror_probe() is the first error reporting function in bpf-loader.c.
> >So, this one, for a non-root user gives me:
> >
> >[acme@felicio linux]$ perf record --event /tmp/foo.o sleep 1
> >event syntax error: '/tmp/foo.o'
> > \___ Invalid argument
> >
> >(add -v to see detail)
> >Run 'perf list' for a list of valid events
> >
> > Usage: perf record [<options>] [<command>]
> > or: perf record [<options>] -- <command> [<options>]
> >
> > -e, --event <event> event selector. use 'perf list' to list available events
> >[acme@felicio linux]$
> >
> >--------------------
> >
> >I.e. no libbpf error (good!) but then, just an -EINVAL as the "event syntax
> >error", which clearly isn't a syntax error; we need to tell the user that he or she
> >needs special permissions for using sys_bpf() :-)
> >
> >As root:
> >
> >[root@felicio ~]# perf record --event /tmp/foo.o sleep
> >event syntax error: '/tmp/foo.o'
> > \___ Invalid argument
> >
> >(add -v to see detail)
> >Run 'perf list' for a list of valid events
> >
> > Usage: perf record [<options>] [<command>]
> > or: perf record [<options>] -- <command> [<options>]
> >
> > -e, --event <event> event selector. use 'perf list' to list available events
> >[root@felicio ~]# ls -la /tmp/foo.o
> >-rw-rw-r--. 1 acme acme 824 Oct 20 12:35 /tmp/foo.o
> >[root@felicio ~]# file /tmp/foo.o
> >/tmp/foo.o: ELF 64-bit LSB relocatable, no machine, version 1 (SYSV), not stripped
> >
> >
> >Hmm, it's something else. This is an ancient kernel, 4.2.0, probably without
> >eBPF support? Nope, it's there:
> >
> >[root@felicio ~]# grep -i sys_bpf /proc/kallsyms
> >ffffffff811829d0 T SyS_bpf
> >ffffffff811829d0 T sys_bpf
> >[root@felicio ~]#
> >
> >It's something else; we need to improve this error reporting:
> >
> >[root@felicio ~]# perf record -v --event /tmp/foo.o sleep 1
> >libbpf: loading /tmp/foo.o
> >libbpf: section .strtab, size 60, link 0, flags 0, type=3
> >libbpf: section .text, size 0, link 0, flags 6, type=1
> >libbpf: section .data, size 0, link 0, flags 3, type=1
> >libbpf: section .bss, size 0, link 0, flags 3, type=8
> >libbpf: section do_fork, size 16, link 0, flags 6, type=1
> >libbpf: found program do_fork
> >libbpf: section license, size 4, link 0, flags 3, type=1
> >libbpf: license of /tmp/foo.o is GPL
> >libbpf: section version, size 4, link 0, flags 3, type=1
> >libbpf: kernel version of /tmp/foo.o is 40100
> >libbpf: section .symtab, size 96, link 1, flags 0, type=2
> >bpf: config program 'do_fork'
> >symbol:do_fork file:(null) line:0 offset:0 return:0 lazy:(null)
> >bpf: 'do_fork': event name is missing
>
> BPF reports the problem, but it is a little bit hard to understand...
>
> >event syntax error: '/tmp/foo.o'
> > \___ Invalid argument
> >
> >(add -v to see detail)
> >Run 'perf list' for a list of valid events
> >
> > Usage: perf record [<options>] [<command>]
> > or: perf record [<options>] -- <command> [<options>]
> >
> > -e, --event <event> event selector. use 'perf list' to list available events
> >[root@felicio ~]#
> >
> >[root@felicio ~]# grep do_fork /proc/kallsyms
> >ffffffff81099ab0 T _do_fork
> >ffffffff81ccc800 d do_fork_test
> >[root@felicio ~]#
> >
> >$ echo '__attribute__((section("_do_fork"), used)) int fork(void *ctx) {return 0;} char _license[] __attribute__((section("license"), used)) = "GPL";int _version __attribute__((section("version"), used)) = 0x40100;' | clang -D__KERNEL__ $CLANG_OPTIONS $KERNEL_INC_OPTIONS -Wno-unused-value -Wno-pointer-sign -working-directory $WORKING_DIR -c - -target bpf -O2 -o /tmp/foo.o

> In your program you only provide "do_fork", but we need "key=value"
> syntax. "key" will become the name of created kprobe. Please try
> "__attribute__((section("func=do_fork"), used)) " instead.
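
A minimal sketch of the "key=value" rule described above, just to make
the failure mode concrete. This is not the perf code: the real parsing
is done by parse_perf_probe_command(), and split_section_name() and the
SEC_* codes below are hypothetical names made up for illustration. It
only splits on the first '=' and ignores the rest of the probe syntax.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes, not part of perf or libbpf. */
enum sec_parse_result {
	SEC_OK = 0,
	SEC_MISSING_EVENT_NAME,	/* no '=' or empty key, e.g. "do_fork" */
	SEC_MISSING_TARGET,	/* nothing after '=', e.g. "func=" */
	SEC_NAME_TOO_LONG,	/* key does not fit in the caller's buffer */
};

/*
 * Split "key=value" at the first '='. The key (the name of the kprobe
 * to create) is copied into 'key'; '*value' points into 'sec' at the
 * probe target. Illustrative only.
 */
static enum sec_parse_result
split_section_name(const char *sec, char *key, size_t key_len,
		   const char **value)
{
	const char *eq;
	size_t n;

	assert(sec && key && value && key_len > 0);

	eq = strchr(sec, '=');
	if (!eq || eq == sec)
		return SEC_MISSING_EVENT_NAME;
	if (eq[1] == '\0')
		return SEC_MISSING_TARGET;

	n = (size_t)(eq - sec);
	if (n >= key_len)
		return SEC_NAME_TOO_LONG;

	memcpy(key, sec, n);
	key[n] = '\0';
	*value = eq + 1;
	return SEC_OK;
}
```

With this sketch, a section named "do_fork" is rejected as having no
event name, while "func=do_fork" yields the key "func" and the target
"do_fork".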

> I think when the event name is missing we'd better construct a name for
> it, as perf probe does, but then we need to deal with the perf probe code
> again. It may require another patch.

Nah, let's go with what we have, i.e. I'll take that into account and
test with the expected form.

> For this patch, I think we can assign a new errno so that
> bpf__strerror_probe() can give more information to let the user know
> whether the problem resides in the BPF program or the perf configuration. Do
> you think ENOEXEC is a good choice?

Unsure if we should use existing errno codes in cases like this,
probably better to use a BPF_ERRNO__ENOALIAS or somesuch.
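
Roughly what I have in mind, as a sketch only. BPF_ERRNO__ENOALIAS is
the name floated above; the start value, the enum, the message text and
bpf_strerror_sketch() are all assumptions for illustration, not what
bpf-loader.c has today. The idea is a private range that cannot collide
with system errno values, falling back to strerror() for everything
else.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical private range, chosen to stay clear of system errnos. */
#define BPF_ERRNO__START 4000

enum bpf_loader_errno_sketch {
	BPF_ERRNO__ENOALIAS = BPF_ERRNO__START,	/* section lacks "name=" */
	BPF_ERRNO__END,
};

/* Messages indexed by (code - BPF_ERRNO__START); wording is made up. */
static const char *const bpf_errno_msgs[BPF_ERRNO__END - BPF_ERRNO__START] = {
	"event name is missing in BPF section name, use 'name=probe'",
};

/*
 * strerror-style helper: accepts err or -err, writes a message to buf.
 * Private codes use the table above, anything else goes to strerror().
 */
static int bpf_strerror_sketch(int err, char *buf, size_t size)
{
	int code = err < 0 ? -err : err;

	assert(buf && size > 0);

	if (code >= BPF_ERRNO__START && code < BPF_ERRNO__END)
		snprintf(buf, size, "%s",
			 bpf_errno_msgs[code - BPF_ERRNO__START]);
	else
		snprintf(buf, size, "%s", strerror(code));
	return 0;
}
```

That way a missing event name gets its own message, while a plain
-EINVAL still prints "Invalid argument" as before.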

- Arnaldo