Re: [PATCH 32/32] bpf: Introduce function for outputing data to perf event

From: Wangnan (F)
Date: Fri Aug 28 2015 - 21:20:55 EST




On 2015/8/29 8:45, Alexei Starovoitov wrote:
On 8/28/15 12:06 AM, Wang Nan wrote:
his patch adds a new trace event to establish infrastruction for bpf to
output data to perf. Userspace perf tools can detect and use this event
as using the existing tracepoint events.

New bpf trace event entry in debugfs:

/sys/kernel/debug/tracing/events/bpf/bpf_output_data

Userspace perf tools detect the new tracepoint event as:

bpf:bpf_output_data [Tracepoint event]

Data in ring-buffer of perf events added to this event will be polled
out, sample types and other attributes can be adjusted to those events
directly without touching the original kprobe events.

Wang,
I have 2nd thoughts on this.
I've played with it, but global bpf:bpf_output_data event is limiting.
I'd like to use this bpf_output_trace_data() helper for tcp estats
gathering, but global collector will prevent other similar bpf programs
running in parallel.

So current model work for you but the problem is all output goes into one
place, which prevents similar BPF programs run in parallel because the
reveicer is unable to tell what message is generated by who. So actually
you want a publish-and-subscribe model, subscriber get messages from only
the publisher it interested in. Am I understand your problem correctly?

So as a concept I think it's very useful, but we need a way to select
which ring-buffer to output data to.
proposal A:
Can we use ftrace:instances concept and make bpf_output_trace_data()
into that particular trace_pipe ?
proposal B:
bpf_perf_event_read() model is using nice concept of an array of
perf_events. Can we perf_event_open a 'new' event that can be mmaped
in user space and bpf_output_trace_data(idx, buf, buf_size) into it.
Where 'idx' will be an index of FD from perf_even_open of such
new event?


I've also thinking about adding the extra id parameter in bpf_output_trace_data()
but it is for encoding the type of output data, which is totally different
from what you want.

For me, I use bpf_output_trace_data() to output information like PMU count
value. Perf is the only receiver, so global collector is perfect. Could you
please describe your usecase in more detail?

Thank you for using that feature!

Thanks!



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/