Re: [PATCH net-next 2/8] perf, bpf: allow bpf programs attach to tracepoints

From: Peter Zijlstra
Date: Tue Apr 05 2016 - 14:17:05 EST


On Tue, Apr 05, 2016 at 11:09:30AM -0700, Alexei Starovoitov wrote:
> >>@@ -67,6 +69,14 @@ perf_trace_##call(void *__data, proto) \
> >> \
> >> { assign; } \
> >> \
> >>+ if (prog) { \
> >>+ *(struct pt_regs **)entry = __regs; \
> >>+ if (!trace_call_bpf(prog, entry) || hlist_empty(head)) { \
> >>+ perf_swevent_put_recursion_context(rctx); \
> >>+ return; \
> >>+ } \
> >>+ memset(&entry->ent, 0, sizeof(entry->ent)); \
> >
> >But if not, you destroy it and then feed it to perf?
>
> yes. If bpf prog returns 1 the buffer goes into normal ring-buffer
> with all perf_event attributes and so on.
> So far there hasn't been a single real use case that took this path.
> Programs always do aggregation inside and pass stuff to user space
> either via bpf maps or via bpf_perf_event_output() helper.
> I wanted to keep perf_trace_xx() calls to be minimal in .text size
> so memset above is one x86 instruction, but I don't mind
> replacing this memset with a call to a helper function that will do:
> local_save_flags(flags);
> tracing_generic_entry_update(entry, flags, preempt_count());
> entry->type = type;
> Then, whether bpf is attached or not, the ring buffer will see the same
> raw tracepoint entry. Do you think that's cleaner?

Yeah, otherwise you get very weird and surprising behaviour.
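
To make the point concrete, here is a rough user-space model of the control
flow under discussion. It is only a sketch: struct raw_entry, struct
pt_regs_model, fill_header() and trace_event() are made-up stand-ins, not the
kernel's definitions, and the header values are arbitrary. It shows the
pt_regs pointer clobbering the common header before the program runs, and the
header being rebuilt (rather than zeroed) when the program returns non-zero.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for the common tracepoint header (assumption). */
struct trace_entry {
	uint16_t type;
	uint8_t  flags;
	uint8_t  preempt_count;
	int32_t  pid;
};

/* Hypothetical raw record: common header followed by one payload field. */
struct raw_entry {
	struct trace_entry ent;
	uint64_t payload;
};

/* Hypothetical stand-in for struct pt_regs. */
struct pt_regs_model {
	uint64_t ip;
};

/* Models an attached program: non-zero means "pass to the ring buffer". */
typedef int (*prog_fn)(void *ctx);

static int prog_pass(void *ctx) { (void)ctx; return 1; }
static int prog_drop(void *ctx) { (void)ctx; return 0; }

/* Models the proposed helper that rebuilds the header after the program ran. */
static void fill_header(struct trace_entry *ent, uint16_t type)
{
	ent->type = type;
	ent->flags = 0;          /* arbitrary value for the model */
	ent->preempt_count = 0;  /* arbitrary value for the model */
	ent->pid = 42;           /* arbitrary value for the model */
}

/* Returns 1 if the record would reach the ring buffer, 0 if it is dropped. */
static int trace_event(struct raw_entry *entry, prog_fn prog,
		       struct pt_regs_model *regs, uint16_t type,
		       uint64_t payload)
{
	fill_header(&entry->ent, type);
	entry->payload = payload;

	if (prog) {
		/* The program context starts with a regs pointer, which
		 * overwrites the first bytes of the header. */
		assert(sizeof(entry->ent) >= sizeof(regs));
		memcpy(entry, &regs, sizeof(regs));
		if (!prog(entry))
			return 0;
		/* Rebuild the header instead of zeroing it. */
		fill_header(&entry->ent, type);
	}
	return 1;
}
```

With the rebuild step in place, the record produced with a passing program
attached is byte-for-byte the same as the one produced with no program at all.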

Also, one possible use-case is dynamic filters where the BPF program is
basically used to filter events, although I suppose we already have a
hook for that elsewhere.
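
A toy illustration of the filter idea, again as a user-space sketch rather
than real BPF: struct event, filter_large_writes() and deliver() are
hypothetical names, and the pid and size thresholds are invented. The only
thing it is meant to show is the return-value convention, where 0 drops the
event and non-zero lets it through to the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event record seen by the filter. */
struct event {
	uint32_t pid;
	uint64_t bytes;
};

/* Models a filter program: 0 drops the event, non-zero keeps it. */
typedef int (*filter_fn)(const struct event *ev);

/* Example filter (assumption): keep only large writes from pid 1234. */
static int filter_large_writes(const struct event *ev)
{
	return ev->pid == 1234 && ev->bytes > 4096;
}

/* Counts how many events would reach the ring buffer.
 * A NULL filter means no program is attached, so everything is kept. */
static size_t deliver(const struct event *evs, size_t n, filter_fn filter)
{
	size_t kept = 0;

	assert(evs != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (!filter || filter(&evs[i]))
			kept++;
	}
	return kept;
}
```

The filter can be swapped at run time without touching the producer, which is
what makes it "dynamic" in the sense above.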