Re: [PATCH RFC 0/4] Add support to directly attach BPF program to ftrace

From: Alexei Starovoitov
Date: Tue Jul 16 2019 - 21:30:49 EST


On Tue, Jul 16, 2019 at 06:31:17PM -0400, Steven Rostedt wrote:
> On Tue, 16 Jul 2019 17:30:50 -0400
> Joel Fernandes <joel@xxxxxxxxxxxxxxxxx> wrote:
>
> > I don't see why a new bpf node for a trace event is a bad idea, really.
> > tracefs is how we deal with trace events on Android. We do it in production
> > systems. This is a natural extension to that and fits with the security model
> > well.
>
> What I would like to see is a way to have BPF inject data into the
> ftrace ring buffer directly. There's a bpf_trace_printk() that I find a
> bit of a hack (especially since it hooks into trace_printk() which is
> only for debugging purposes). Have a dedicated bpf ftrace ring
> buffer event that can be triggered is what I am looking for. Then comes
> the issue of what ring buffer to place it in, as ftrace can have
> multiple ring buffer instances. But these instances are defined by the
> tracefs instances directory. Having a way to associate a bpf program to
> a specific event in a specific tracefs directory could allow for ways to
> trigger writing into the correct ftrace buffer.

bpf can write anything (including full skb) into perf ring buffer.
I don't think there is a use case yet to write into ftrace ring buffer.
But I can be convinced otherwise :)

> But looking over the patches, I see what Alexei means that there's no
> overlap with ftrace and these patches except for the tracefs directory
> itself (which is part of the ftrace infrastructure). And the trace
> events are technically part of the ftrace infrastructure too. I see the
> tracefs interface being used, but I don't see how the bpf programs
> being added affect the ftrace ring buffer or other parts of ftrace. And
> I'm guessing that's what is confusing Alexei.

yep.
What I really like to see some day is proper integration of real ftrace
and bpf. So that bpf progs can be invoked from some of the ftrace machinery.
And the other way around.
Like I'd love to be able to start ftracing all functions from bpf
and stop it from bpf.
The use case is kernel debugging. I'd like to examine a bunch of condition
and start ftracing the execution. Then later decide wether this collection
of ip addresses is interesting to analyze and post process it quickly
while still inside bpf program.