[RFC PATCH v3 00/37] perf tools: introduce 'perf bpf' command to load eBPF programs.

From: Wang Nan
Date: Sun May 17 2015 - 06:58:11 EST


This is the 3rd version of 'perf bpf' patch series, based on
v4.1-rc3.

The goal of this series is to integrate eBPF with perf. With these
patches applied, users can use the following command to load an eBPF
program compiled by LLVM into the kernel and then start recording with
filters on:

# perf bpf record --object sample_bpf.o -- -a sleep 4

I am posting a new version before receiving enough comments because I
found that the v2 series lost a small but important patch (37/37 in this
series). It does the final work: attaching eBPF programs to perf events.

Besides that change, the v3 patch series drops the '|' event syntax
introduced in v2, because I realized that it allowed users to pass an
arbitrary bpf fd, like:

# perf bpf record -- -e sched:sched_switch|100| sleep 1

which may cause trouble. In v3, patch 36/37 passes file descriptors to
evsel through bpf_for_each_program(), which iterates over each bpf
program and calls a callback function. bpf_fd is set by the callback.
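
As a rough illustration of the callback approach described above, here
is a minimal sketch. All demo_* names, struct layouts and the matching
rule are assumptions made for illustration; they are not the signatures
used by bpf_for_each_program() or evsel in this series. The point is
that the fd can only come from a program the tool itself loaded, never
from the command line.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; the real types live in libbpf and perf. */
struct demo_prog {
	const char *event;	/* event this program is meant for */
	int fd;			/* fd obtained when the program was loaded */
};

struct demo_evsel {
	const char *name;
	int bpf_fd;		/* -1 while no program is attached */
};

typedef int (*demo_prog_cb)(const struct demo_prog *prog, void *arg);

/* Call cb for every program; stop at the first non-zero return value. */
static int demo_for_each_program(const struct demo_prog *progs, size_t nr,
				 demo_prog_cb cb, void *arg)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		int err = cb(&progs[i], arg);

		if (err)
			return err;
	}
	return 0;
}

/* Callback: record the fd on the evsel whose name matches the program. */
static int demo_set_bpf_fd(const struct demo_prog *prog, void *arg)
{
	struct demo_evsel *evsel = arg;

	if (strcmp(prog->event, evsel->name) == 0)
		evsel->bpf_fd = prog->fd;
	return 0;
}
```

A caller would build the program array from the loaded object and invoke
demo_for_each_program(progs, nr, demo_set_bpf_fd, &evsel) once per evsel.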

Following Ingo's suggestion, I renamed the titles of all patches in
this series to make the shortlog easier to read.

Since I haven't received enough comments, the question from v2 is still
open:

Do we actually need a 'perf bpf' command? We can get a similar result by
modifying 'perf record' to make it load eBPF programs before recording.

I suggest keeping 'perf bpf' to group all eBPF-related features under a
uniform entry. Also, eBPF programs can act not only as filters but also
as data aggregators. It is possible to add something like 'perf bpf run'
that simply runs the programs and dumps the results after the user hits
'C-c' or a timeout expires. The 'config' section may be utilized in this
case to let 'perf bpf' know how to display results.

A detailed description follows. Most of the following text is copied
from the cover letter of v2. You can skip it if you have already read
the v2 series.

Patches 1/37 - 5/37 are preparation and bugfixes. Some of them are
already acked.

Patches 6/37 - 26/37 create tools/lib/bpf.

Libbpf will be compiled into libbpf.a and libbpf.so. It can be
divided into 2 parts:

1) User-kernel interface. The API is defined by tools/lib/bpf/bpf.h and
encapsulates map and program loading operations. To improve
performance, bpf_load_program() does not use a log buffer on the first
try, and retries with the log buffer enabled on failure.
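
The retry strategy described above can be sketched roughly as follows.
This is not the code from patch 23/37: demo_load_fn, demo_reject_all()
and demo_load_with_retry() are hypothetical names, and the stub loader
only stands in for the real syscall so that the control flow can be
exercised without a kernel.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical low-level loader: returns an fd >= 0 or a negative error. */
typedef int (*demo_load_fn)(const void *insns, size_t insn_cnt,
			    char *log_buf, size_t log_size);

/* Stub standing in for the kernel: rejects everything, logs if asked to. */
static int demo_reject_all(const void *insns, size_t insn_cnt,
			   char *log_buf, size_t log_size)
{
	(void)insns;
	(void)insn_cnt;
	if (log_buf && log_size)
		snprintf(log_buf, log_size, "demo: program rejected");
	return -1;
}

static int demo_load_with_retry(demo_load_fn load,
				const void *insns, size_t insn_cnt,
				char *log_buf, size_t log_size)
{
	int fd;

	/* Fast path: no log buffer, so nothing has to be filled in. */
	fd = load(insns, insn_cnt, NULL, 0);
	if (fd >= 0 || !log_buf || !log_size)
		return fd;

	/* Slow path: retry only to collect the diagnostic message. */
	log_buf[0] = '\0';
	return load(insns, insn_cnt, log_buf, log_size);
}
```

With this shape, programs that load successfully never pay for the log
buffer, and the message is produced only when someone needs to read it.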

2) ELF operations. The structure of an eBPF object file is defined
here. The API of this part can be found in tools/lib/bpf/libbpf.h.
'struct bpf_map_def' is also defined here.

Libbpf's API hides internal structures. Callers access data of
object files with handlers and accessors. 'struct bpf_object *'
is the handler of a whole object file. 'struct bpf_prog_handler *'
is the handler and iterator of programs. Some accessors are
defined to enable callers to retrieve the section name and file
descriptor of a program. Further accessors can be added.

In the design of libbpf, I explicitly separate the full procedure
into an opening and a loading phase. Data are collected during the
'opening' phase. BPF syscalls are called in the 'loading' phase.
The separation is designed for potential cross-object operations.
It also gives the caller a chance to adjust bytecode and/or maps
before the real loading. (An API for such operations is not provided
in this version.)
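
To make the open/load split concrete, here is a minimal sketch of the
shape such an API could take. The demo_object type and the demo_* calls
are hypothetical and do not reproduce libbpf.h. The "loading" step takes
a fake fd from the caller in place of the BPF syscalls.

```c
#include <stdlib.h>
#include <string.h>

/* In a real library this layout would be private to the .c file. */
struct demo_object {
	char *path;
	int loaded;
	int prog_fd;
};

/* Opening phase: only collects data; nothing is sent to the kernel. */
static struct demo_object *demo_object__open(const char *path)
{
	struct demo_object *obj = calloc(1, sizeof(*obj));
	size_t len = strlen(path) + 1;

	if (!obj)
		return NULL;
	obj->path = malloc(len);
	if (!obj->path) {
		free(obj);
		return NULL;
	}
	memcpy(obj->path, path, len);
	obj->prog_fd = -1;
	return obj;
}

/*
 * Loading phase: where the BPF syscalls would be issued. The fd is
 * supplied by the caller here purely so the sketch runs anywhere.
 */
static int demo_object__load(struct demo_object *obj, int fake_fd)
{
	if (obj->loaded)
		return -1;	/* an object is loaded at most once */
	obj->prog_fd = fake_fd;
	obj->loaded = 1;
	return 0;
}

/* Accessor: callers never touch the struct fields directly. */
static int demo_object__prog_fd(const struct demo_object *obj)
{
	return obj->prog_fd;
}

static void demo_object__close(struct demo_object *obj)
{
	if (!obj)
		return;
	free(obj->path);
	free(obj);
}
```

Anything that needs to inspect or adjust the object would run between
demo_object__open() and demo_object__load().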

During loading, fields in 'struct bpf_map_def' are also byte-swapped
if the endianness does not match.
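
A minimal sketch of such a fixup is shown below. The field list of
demo_map_def (four 32-bit values) is an assumption made for
illustration, and needs_swap stands for the swap flag that patch 10/37
derives from the ELF header.

```c
#include <stdint.h>

/* Assumed layout for illustration; see libbpf.h for the real struct. */
struct demo_map_def {
	uint32_t type;
	uint32_t key_size;
	uint32_t value_size;
	uint32_t max_entries;
};

/* Portable 32-bit byte swap. */
static uint32_t demo_bswap32(uint32_t v)
{
	return ((v & 0x000000ffu) << 24) |
	       ((v & 0x0000ff00u) << 8) |
	       ((v & 0x00ff0000u) >> 8) |
	       ((v & 0xff000000u) >> 24);
}

/* Swap every field when the object and the host disagree on byte order. */
static void demo_map_def__fixup(struct demo_map_def *def, int needs_swap)
{
	if (!needs_swap)
		return;
	def->type = demo_bswap32(def->type);
	def->key_size = demo_bswap32(def->key_size);
	def->value_size = demo_bswap32(def->value_size);
	def->max_entries = demo_bswap32(def->max_entries);
}
```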

Patches 27/37 - 37/37 are perf patches, which introduce the 'perf bpf'
command and the 'perf bpf record' subcommand.

'perf bpf' is not a standalone command. The usage should be:

perf bpf [<options>] <command> --objects <objfile> -- \
<args passed to other cmds>

The first two patches make 'perf bpf' available and make perf depend on
libbpf. 29/37 creates 'perf bpf record' and directly passes
everything after '--' to cmd_record(). The rest resides in
tools/perf/util/bpf-loader.[ch], which is introduced in 30/37.
The following patches do the collection -> probing -> loading work
step by step. In those operations, 'perf bpf' collects all required
objects before creating kprobe points, and loads programs into the
kernel after probing finishes.
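
The forwarding of everything after '--' can be pictured with the small
sketch below. It is an illustration only: demo_args_after_dashdash() is
a hypothetical helper, and the series relies on perf's own option parser
rather than a hand-written scan like this one.

```c
#include <string.h>

/*
 * Return the index of the first argument after "--", or argc when no
 * "--" is present. argv + index and argc - index can then be handed to
 * the wrapped command unchanged.
 */
static int demo_args_after_dashdash(int argc, const char **argv)
{
	int i;

	for (i = 0; i < argc; i++)
		if (strcmp(argv[i], "--") == 0)
			return i + 1;
	return argc;
}
```

For the command line shown at the top of this letter, the arguments
forwarded to the record command would be "-a sleep 4".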

bpf_unload() is used to remove kprobe points. I use an 'atexit'
hook to ensure it is called before exiting. However, I found that
atexit hooks do not always work, for example when the program
is canceled by SIGINT. Therefore we still need to call bpf_unload()
after cmd_record().
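
Since the cleanup can now be reached both from the atexit hook and from
the explicit call, it has to be safe to run twice. A minimal sketch of
that pattern follows. The demo_* names and the counter are hypothetical
and only model the kprobe removal.

```c
#include <stdlib.h>

static int demo_probes_active;	/* non-zero while kprobe points exist */
static int demo_unload_calls;	/* counts real cleanups, for illustration */

/* Idempotent cleanup: the second and later calls do nothing. */
static void demo_bpf_unload(void)
{
	if (!demo_probes_active)
		return;
	demo_probes_active = 0;
	demo_unload_calls++;
}

/* Stub standing in for the wrapped record command. */
static int demo_record_ok(void)
{
	return 0;
}

static int demo_bpf_record(int (*record)(void))
{
	int ret;

	demo_probes_active = 1;
	/* Safety net for code paths that call exit() directly. */
	atexit(demo_bpf_unload);

	ret = record();

	/* Explicit call: do not rely on the exit hook alone. */
	demo_bpf_unload();
	return ret;
}
```

Handlers registered with atexit() run only on normal process
termination, which is why the explicit call after the record step is
kept.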

Patch 36/37 adds a bpf_fd field to evsel and configures it.

Patch 37/37 finally attaches the eBPF program to the perf event.

Wang Nan (37):
perf/events/core: fix race in bpf program unregister
perf tools: Set vmlinux_path__nr_entries to 0 in vmlinux_path__exit
tools lib traceevent: Install libtraceevent.a into libdir
tools: Change FEATURE_TESTS and FEATURE_DISPLAY to weak binding
tools: Add __aligned_u64 to types.h
bpf tools: Introduce 'bpf' library to tools
bpf tools: Allow caller to set printing function
bpf tools: Define basic interface
bpf tools: Open eBPF object file and do basic validation
bpf tools: Check endianess and set swap flag according to EHDR
bpf tools: Iterate over ELF sections to collect information
bpf tools: Collect version and license from ELF sections
bpf tools: Collect map definitions from 'maps' section
bpf tools: Collect config string from 'config' section
bpf tools: Collect symbol table from SHT_SYMTAB section
bpf tools: Collect eBPF programs from their own sections
bpf tools: Collect relocation sections from SHT_REL sections
bpf tools: Record map accessing instructions for each program
bpf tools: Clear libelf and ELF parsing resrouce to finish opening
bpf tools: Add bpf.c/h for common bpf operations
bpf tools: Create eBPF maps defined in an object file
bpf tools: Relocate eBPF programs
bpf tools: Introduce bpf_load_program() to bpf.c
bpf tools: Load eBPF programs in object files into kernel
bpf tools: Introduce accessors for struct bpf_program
bpf tools: Introduce accessors for struct bpf_object
perf tools: Add new 'perf bpf' command
perf tools: Make perf depend on libbpf
perf bpf: Add 'perf bpf record' subcommand
perf bpf: Add bpf-loader and open ELF object files
perf bpf: Collect all eBPF programs
perf bpf: Parse probe points of eBPF programs during preparation
perf bpf: Probe at kprobe points
perf bpf: Load all eBPF object into kernel
perf tools: Add a bpf_wrapper global flag
perf tools: Add bpf_fd field to evsel and config it
perf tools: Attach eBPF program to perf event

kernel/events/core.c | 3 +-
tools/build/Makefile.feature | 4 +-
tools/include/linux/types.h | 5 +
tools/lib/bpf/.gitignore | 2 +
tools/lib/bpf/Build | 1 +
tools/lib/bpf/Makefile | 191 ++++++
tools/lib/bpf/bpf.c | 87 +++
tools/lib/bpf/bpf.h | 24 +
tools/lib/bpf/libbpf.c | 1089 +++++++++++++++++++++++++++++++++
tools/lib/bpf/libbpf.h | 66 ++
tools/lib/traceevent/Makefile | 20 +-
tools/perf/Build | 1 +
tools/perf/Documentation/perf-bpf.txt | 18 +
tools/perf/Makefile.perf | 20 +-
tools/perf/builtin-bpf.c | 217 +++++++
tools/perf/builtin-record.c | 6 +
tools/perf/builtin.h | 1 +
tools/perf/command-list.txt | 1 +
tools/perf/perf.c | 10 +
tools/perf/perf.h | 1 +
tools/perf/util/Build | 1 +
tools/perf/util/bpf-loader.c | 262 ++++++++
tools/perf/util/bpf-loader.h | 34 +
tools/perf/util/debug.c | 5 +
tools/perf/util/debug.h | 1 +
tools/perf/util/evlist.c | 34 +
tools/perf/util/evlist.h | 1 +
tools/perf/util/evsel.c | 17 +
tools/perf/util/evsel.h | 1 +
tools/perf/util/parse-options.c | 8 +-
tools/perf/util/parse-options.h | 2 +
tools/perf/util/symbol.c | 1 +
32 files changed, 2122 insertions(+), 12 deletions(-)
create mode 100644 tools/lib/bpf/.gitignore
create mode 100644 tools/lib/bpf/Build
create mode 100644 tools/lib/bpf/Makefile
create mode 100644 tools/lib/bpf/bpf.c
create mode 100644 tools/lib/bpf/bpf.h
create mode 100644 tools/lib/bpf/libbpf.c
create mode 100644 tools/lib/bpf/libbpf.h
create mode 100644 tools/perf/Documentation/perf-bpf.txt
create mode 100644 tools/perf/builtin-bpf.c
create mode 100644 tools/perf/util/bpf-loader.c
create mode 100644 tools/perf/util/bpf-loader.h

--
1.8.3.4
