Re: [PATCH net-next 0/8] allow bpf attach to tracepoints

From: Alexei Starovoitov
Date: Mon Apr 18 2016 - 15:52:56 EST


On 4/18/16 9:13 AM, Steven Rostedt wrote:
> On Mon, 4 Apr 2016 21:52:46 -0700
> Alexei Starovoitov <ast@xxxxxx> wrote:
>
>> Hi Steven, Peter,
>>
>> last time we discussed bpf+tracepoints it was a year ago [1] and the reason
>> we didn't proceed with that approach was that bpf would make arguments
>> arg1, arg2 to trace_xx(arg1, arg2) call to be exposed to bpf program
>> and that was considered unnecessary extension of abi. Back then I wanted
>> to avoid the cost of buffer alloc and field assign part in all
>> of the tracepoints, but looks like when optimized the cost is acceptable.
>> So this new approach doesn't expose any new abi to bpf program.
>> The program is looking at tracepoint fields after they were copied
>> by perf_trace_xx() and described in /sys/kernel/debug/tracing/events/xxx/format
>
> Does this mean that ftrace could use this ability as well? As all the
> current filtering of ftrace was done after it was copied to the buffer,
> and that was what you wanted to avoid.

yeah, it could be added to ftrace as well, but it won't be as effective
as perf_trace, since the cost of trace_event_buffer_reserve() in
trace_event_raw_event_() handler is significantly higher than perf_trace_buf_alloc() in perf_trace_().
Then from the program point of view it wouldn't care how that memory
is allocated, so the user tools will just use perf_trace_() style.
The only use case I see for bpf with ftrace's tracepoint handler
is to work as an actual filter, but we already have filters there...
so the actual value of adding bpf to ftrace's tracepoint handler
is not clear to me.
At the same time there is the whole of ftrace as a function tracer. That is
a very lucrative field for bpf to plug into ;)

As for the 2nd part of your question, about copying: yeah, it adds to
the cost, so a kprobe is sometimes faster than a perf_trace tracepoint
that copies a lot of args which are not going to be used.
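
To make the "program only sees the copied fields" point concrete, here is
a rough userspace sketch in plain C. It is not kernel code and not the
code from the patches. struct sample_record, fill_record() and
sample_prog() are hypothetical names: the struct stands in for a record
laid out the way a format file would describe it, fill_record() stands in
for the copy step done by a perf_trace_xx() style handler, and
sample_prog() stands in for the attached program. Only the common_*
field names follow what tracepoint format files list. arg1 and arg2 are
made up.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record layout, standing in for what a format file describes. */
struct sample_record {
	uint16_t common_type;
	uint8_t  common_flags;
	uint8_t  common_preempt_count;
	int32_t  common_pid;
	int32_t  arg1;	/* made-up event field */
	int64_t  arg2;	/* made-up event field */
};

/*
 * Stand-in for the copy step: the call arguments are assigned into the
 * record before any program runs. This is the cost discussed above; it is
 * paid for every field, whether or not the program reads it.
 */
static void fill_record(struct sample_record *rec, int32_t pid,
			int32_t arg1, int64_t arg2)
{
	assert(rec != NULL);
	memset(rec, 0, sizeof(*rec));
	rec->common_pid = pid;
	rec->arg1 = arg1;
	rec->arg2 = arg2;
}

/*
 * Stand-in for the attached program: it gets a pointer to the copied
 * record only, never the original call arguments.
 * Returns 1 to keep the event, 0 to drop it.
 */
static int sample_prog(const void *ctx)
{
	const struct sample_record *rec = ctx;

	return rec->arg2 > 100;
}
```

A caller would fill a record with fill_record() and then pass its address
to sample_prog(). The program's view is fixed by the record layout, so
changing how the record memory is allocated does not change the program.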