[PATCHSET v3 0/3] perf stat: Enable BPF counters with --for-each-cgroup

From: Namhyung Kim
Date: Tue Jun 22 2021 - 03:12:31 EST


Hello,

This patchset adds BPF support for --for-each-cgroup so that perf stat
can handle many cgroup events on big machines. Use the --bpf-counters
option to enable the new behavior.

* changes in v3
- support cgroup hierarchy with ancestor ids
- add and trigger raw_tp BPF program
- add a build rule for vmlinux.h

* changes in v2
- remove incorrect use of BPF_F_PRESERVE_ELEMS
- add missing map elements after lookup
- handle cgroup v1

The basic idea is to use a single set of per-cpu events to count the
events of interest and aggregate the counts for each cgroup. I used the
bperf mechanism to run a BPF program on cgroup-switches and save the
results in the matching map element for the given cgroups.
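
The accounting described above can be sketched in plain userspace C.
This is only an illustration of the delta-on-switch idea, not the code
in bperf_cgroup.bpf.c: struct cpu_state, cgrp_total[], NR_CGROUPS and
on_cgroup_switch() are hypothetical names, and a plain array stands in
for the BPF map.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CGROUPS 3	/* hypothetical number of monitored cgroups */

/* Hypothetical per-CPU state for one shared event. */
struct cpu_state {
	uint64_t prev_count;	/* counter value at the last cgroup switch */
	int cur_cgrp;		/* index of the cgroup running on this CPU */
};

/* Stand-in for the BPF map holding one accumulated value per cgroup. */
static uint64_t cgrp_total[NR_CGROUPS];

/*
 * Called when the CPU switches to a task in another cgroup: charge the
 * counts seen since the previous switch to the outgoing cgroup, then
 * start a new interval for the incoming one.
 */
static void on_cgroup_switch(struct cpu_state *cpu, uint64_t now, int next_cgrp)
{
	assert(next_cgrp >= 0 && next_cgrp < NR_CGROUPS);

	cgrp_total[cpu->cur_cgrp] += now - cpu->prev_count;
	cpu->prev_count = now;
	cpu->cur_cgrp = next_cgrp;
}
```

Starting from a zeroed state in cgroup 0, counter readings of 100, 250
and 300 at three successive switches leave 100, 150 and 50 in the
totals for cgroups 0, 1 and 2. The hardware event itself is never
reprogrammed; only the bookkeeping changes at each switch.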

Without this, we need separate events for each cgroup, which creates
unnecessary multiplexing overhead (and PMU programming) when tasks in
different cgroups are switched. I saw this make a big difference on
256-cpu machines with hundreds of cgroups.
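
To give a rough sense of scale, the two helpers below compare how many
per-cpu events get opened in each mode. This is back-of-the-envelope
arithmetic under the assumption that the non-BPF path opens one event
per (event, cpu, cgroup) combination and the BPF path one per
(event, cpu); the function names are hypothetical.

```c
/* Events opened when every cgroup gets its own per-cpu copy of each event. */
static unsigned long nr_events_per_cgroup(unsigned long nr_events,
					  unsigned long nr_cpus,
					  unsigned long nr_cgroups)
{
	return nr_events * nr_cpus * nr_cgroups;
}

/* Events opened when one per-cpu set is shared by all cgroups. */
static unsigned long nr_events_shared(unsigned long nr_events,
				      unsigned long nr_cpus)
{
	return nr_events * nr_cpus;
}
```

With, say, 4 events on 256 cpus and 200 cgroups (illustrative numbers,
not measurements), that is 204800 events in the first case versus 1024
in the second.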

Actually this is what I wanted to do in the kernel [1], but we can
do the job using BPF!


Thanks,
Namhyung


[1] https://lore.kernel.org/lkml/20210413155337.644993-1-namhyung@xxxxxxxxxx/


Namhyung Kim (3):
  perf tools: Add read_cgroup_id() function
  perf tools: Add cgroup_is_v2() helper
  perf stat: Enable BPF counter with --for-each-cgroup

 tools/perf/Makefile.perf                    |   7 +-
 tools/perf/util/Build                       |   1 +
 tools/perf/util/bpf_counter.c               |   5 +
 tools/perf/util/bpf_counter_cgroup.c        | 337 ++++++++++++++++++++
 tools/perf/util/bpf_skel/bperf_cgroup.bpf.c | 207 ++++++++++++
 tools/perf/util/cgroup.c                    |  46 +++
 tools/perf/util/cgroup.h                    |  12 +
 7 files changed, 614 insertions(+), 1 deletion(-)
 create mode 100644 tools/perf/util/bpf_counter_cgroup.c
 create mode 100644 tools/perf/util/bpf_skel/bperf_cgroup.bpf.c

--
2.32.0.288.g62a8d224e6-goog