Re: [PATCH v3 perf, bpf-next 1/4] perf, bpf: Introduce PERF_RECORD_BPF_EVENT

From: Arnaldo Carvalho de Melo
Date: Wed Dec 12 2018 - 08:27:52 EST


Em Wed, Dec 12, 2018 at 02:15:49PM +0100, Peter Zijlstra escreveu:
> On Tue, Dec 11, 2018 at 03:33:47PM -0800, Song Liu wrote:
> > For better performance analysis of BPF programs, this patch introduces
> > PERF_RECORD_BPF_EVENT, a new perf_event_type that exposes BPF program
> > load/unload information to user space.
> >
> > Each BPF program may contain up to BPF_MAX_SUBPROGS (256) sub programs.
> > The following example shows kernel symbols for a BPF program with 7
> > sub programs:
> >
> > ffffffffa0257cf9 t bpf_prog_b07ccb89267cf242_F
> > ffffffffa02592e1 t bpf_prog_2dcecc18072623fc_F
> > ffffffffa025b0e9 t bpf_prog_bb7a405ebaec5d5c_F
> > ffffffffa025dd2c t bpf_prog_a7540d4a39ec1fc7_F
> > ffffffffa025fcca t bpf_prog_05762d4ade0e3737_F
> > ffffffffa026108f t bpf_prog_db4bd11e35df90d4_F
> > ffffffffa0263f00 t bpf_prog_89d64e4abf0f0126_F
> > ffffffffa0257cf9 t bpf_prog_ae31629322c4b018__dummy_tracepoi
>
> Doesn't BPF have enough information to generate 'saner' names? Going by
> the thing below, these sub-progs are actually functions, right?

Yeah, this looks just like a symbol table, albeit just with functions,
so far.

> > /*
> > * Record different types of bpf events:
> > * enum perf_bpf_event_type {
> > * PERF_BPF_EVENT_UNKNOWN = 0,
> > * PERF_BPF_EVENT_PROG_LOAD = 1,
> > * PERF_BPF_EVENT_PROG_UNLOAD = 2,
> > * };
> > *
> > * struct {
> > * struct perf_event_header header;
> > * u32 type;
> > * u32 flags;
> > * u32 id; // prog_id or other id
> > * u32 sub_id; // subprog id
> > *
> > * // for bpf_prog types, bpf prog or subprog
> > * u8 tag[BPF_TAG_SIZE];
> > * u64 addr;
> > * u64 len;
> > * char name[];
> > * struct sample_id sample_id;
> > * };
> > */
>
> Isn't this mixing two different things (poorly)? The kallsym update and
> the BPF load/unload ?
>
> And while this tracks the bpf kallsyms, it does not do all kallsyms.
>
> .... Oooh, I see the problem, everybody is doing their own custom
> kallsym_{add,del}() thing, instead of having that in generic code :-(

> This, for example, doesn't track module load/unload nor ftrace
> trampolines, even though both affect kallsyms.

So you think the best would have to be a PERF_RECORD_ that just states
that code has been loaded at range (addr, len). I.e. much like
PERF_RECORD_MMAP does, just for userspace? Then it could be used by BPF
and any other kernel facility like the ones you described?

There would be an area that would be used by each of these facilities to
figure out further info, like we use the mmap->name to go and get the
symbol table from ELF files, etc, but BPF could use for their specific
stuff?

The above is almost like PERF_RECORD_MMAP (PERF_BPF_EVENT_PROG_LOAD) +
PERF_RECORD_MUNMAP(PERF_BPF_EVENT_PROG_UNLOAD) in one event, with this
'type' thing allowing for more stuff to be put in later, I guess.

- Arnaldo

> > +void perf_event_bpf_event_prog(enum perf_bpf_event_type type,
> > + struct bpf_prog *prog)
> > +{
> > + if (!atomic_read(&nr_bpf_events))
> > + return;
> > +
> > + if (type != PERF_BPF_EVENT_PROG_LOAD &&
> > + type != PERF_BPF_EVENT_PROG_UNLOAD)
> > + return;
> > +
> > + if (prog->aux->func_cnt == 0) {
> > + perf_event_bpf_event_subprog(type, prog,
> > + prog->aux->id, 0);
> > + } else {
> > + int i;
> > +
> > + for (i = 0; i < prog->aux->func_cnt; i++)
> > + perf_event_bpf_event_subprog(type, prog->aux->func[i],
> > + prog->aux->id, i);
> > + }
> > +}
>

--

- Arnaldo