Re: [RFC PATCH bpf-next 2/4] bpf: make BTF show support generic, apply to seq files/bpf_trace_printk
From: Alan Maguire
Date: Tue Aug 18 2020 - 05:15:26 EST
On Fri, 14 Aug 2020, Alexei Starovoitov wrote:
> On Fri, Aug 14, 2020 at 02:06:37PM +0100, Alan Maguire wrote:
> > On Wed, 12 Aug 2020, Alexei Starovoitov wrote:
> >
> > > On Thu, Aug 06, 2020 at 03:42:23PM +0100, Alan Maguire wrote:
> > > >
> > > > The bpf_trace_printk tracepoint is augmented with a "trace_id"
> > > > field; it is used to allow tracepoint filtering as typed display
> > > > information can easily be interspersed with other tracing data,
> > > > making it hard to read. Specifying a trace_id will allow users
> > > > to selectively trace data, eliminating noise.
> > >
> > > Since trace_id is not seen in trace_pipe, how do you expect users
> > > to filter by it?
> >
> > Sorry should have specified this. The approach is to use trace
> > instances and filtering such that we only see events associated
> > with a specific trace_id. There's no need for the trace event to
> > actually display the trace_id - it's still usable as a filter.
> > The steps involved are:
> >
> > 1. create a trace instance within which we can specify a fresh
> > set of trace event enablings, filters etc.
> >
> > mkdir /sys/kernel/debug/tracing/instances/traceid100
> >
> > 2. enable the filter for the specific trace id
> >
> > echo "trace_id == 100" >
> > /sys/kernel/debug/tracing/instances/traceid100/events/bpf_trace/bpf_trace_printk/filter
> >
> > 3. enable the trace event
> >
> > echo 1 >
> > /sys/kernel/debug/tracing/instances/events/bpf_trace/bpf_trace_printk/enable
> >
> > 4. ensure the BPF program uses a trace_id 100 when calling bpf_trace_btf()
>
> ouch.
> I think you interpreted the acceptance of the
> commit 7fb20f9e901e ("bpf, doc: Remove references to warning message when using bpf_trace_printk()")
> in the wrong way.
>
> Everything that doc had said is still valid. In particular:
> -A: This is done to nudge program authors into better interfaces when
> -programs need to pass data to user space. Like bpf_perf_event_output()
> -can be used to efficiently stream data via perf ring buffer.
> -BPF maps can be used for asynchronous data sharing between kernel
> -and user space. bpf_trace_printk() should only be used for debugging.
>
> bpf_trace_printk is for debugging only. _debugging of bpf programs themselves_.
> What you're describing above is logging and tracing. It's not debugging of programs.
> perf buffer, ring buffer, and seq_file interfaces are the right
> interfaces for tracing, logging, and kernel debugging.
>
> > > It also feels like workaround. May be let bpf prog print the whole
> > > struct in one go with multiple new lines and call
> > > trace_bpf_trace_printk(buf) once?
> >
> > We can do that absolutely, but I'd be interested to get your take
> > on the filtering mechanism before taking that approach. I'll add
> > a description of the above mechanism to the cover letter and
> > patch to be clearer next time too.
>
> I think patch 3 is no go, because it takes bpf_trace_printk in
> the wrong direction.
> Instead please refactor it to use string buffer or seq_file as an output.
Fair enough. I'm thinking a helper like
long bpf_btf_snprintf(char *str, u32 str_size, struct btf_ptr *ptr,
u32 ptr_size, u64 flags);
Then the user can choose perf event or ringbuf interfaces
to share the results with userspace.
> If the user happen to use bpf_trace_printk("%s", buf);
> after that to print that string buffer to trace_pipe that's user's choice.
> I can see such use case when program author wants to debug
> their bpf program. That's fine. But for kernel debugging, on demand and
> "always on" logging and tracing the documentation should point
> to sustainable interfaces that don't interfere with each other,
> can be run in parallel by multiple users, etc.
>
The problem with bpf_trace_printk() under this approach is
that the string size for %s arguments is very limited;
bpf_trace_printk() restricts these to 64 bytes in size.
Looks like bpf_seq_printf() restricts a %s string to 128
bytes also. We could add an additional helper for the
bpf_seq case which calls bpf_seq_printf() for each component
in the object, i.e.
long bpf_seq_btf_printf(struct seq_file *m, struct btf_ptr *ptr,
u32 ptr_size, u64 flags);
This would steer users away from bpf_trace_printk()
for this use case - since it can print only a small
amount of the string - while supporting all
the other user-space communication mechanisms.
Alan