[RFC PATCH v4 0/3] Make eBPF programs output data to perf event

From: He Kuang
Date: Fri Jul 10 2015 - 06:04:16 EST


Hi,

Previous discussion (patch v3):
http://thread.gmane.org/gmane.linux.kernel/1990197/focus=1991022

We found that creating a new trace event for the bpf subsystem is
simpler than adding a new ftrace:bpf entry: the only thing needed to
output bpf sample events is a call to a predefined trace event
function. This also makes it easy to add new output formats, rather
than being restricted to one fixed format like ftrace:function.

By using trace events, we also benefit from the dynamic array field
type, which outputs only the number of items we filled in, and we get
a standardized output format. A function for getting the number of
items in a dynamic array is added to libtraceevent, and an incorrect
result from the macro __get_dynamic_array_len is corrected.
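
To illustrate the difference between an array's byte size and its
number of items, here is a minimal user-space sketch. It assumes the
usual ftrace convention for a dynamic-array descriptor word (offset in
the low 16 bits, byte size in the high 16 bits); the helper names are
hypothetical and are not the functions added by this series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Assumed descriptor layout (not taken from this patch set):
 *   bits  0..15  offset of the array data within the record
 *   bits 16..31  size of the array data in bytes
 */
static inline uint32_t data_loc_offset(uint32_t loc)
{
	return loc & 0xffff;
}

static inline uint32_t data_loc_size(uint32_t loc)
{
	return (loc >> 16) & 0xffff;
}

/* Hypothetical helper: item count = byte size / element size. */
static inline uint32_t data_loc_nr_items(uint32_t loc, size_t elem_size)
{
	assert(elem_size != 0);
	return data_loc_size(loc) / (uint32_t)elem_size;
}
```

With this layout, a 24-byte array of u64 values holds three items, so
a consumer that expects an item count must not be handed the byte size.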

eBPF sample code, which tests outputting multiple items:

SEC("generic_perform_write=generic_perform_write")
int NODE_generic_perform_write(struct pt_regs *ctx)
{
	char fmt[] = "generic_perform_write last=%lld, cur=%lld, del=%lld\n";
	u64 cur_time, del_time, result[3] = {0};
	int ind = 0;
	struct time_table *last = bpf_map_lookup_elem(&global_time_table, &ind);
	struct time_table output;

	if (!last)
		return 0;

	cur_time = bpf_ktime_get_ns();

	/* No delta on the first call, when no previous timestamp exists */
	if (!last->last_time)
		del_time = 0;
	else
		del_time = cur_time - last->last_time;

	/* For debug */
	bpf_trace_printk(fmt, sizeof(fmt), last->last_time, cur_time, del_time);
	result[0] = last->last_time;

	/* Table update */
	output.last_time = cur_time;
	bpf_map_update_elem(&global_time_table, &ind, &output, BPF_ANY);

	/* This is an arbitrary condition to demonstrate the function */
	if (del_time < 1000)
		return 0;

	result[1] = cur_time;
	result[2] = del_time;
	bpf_output_trace_data(result, sizeof(result));

	return 0;
}

Record bpf events:

$ perf record -e bpf:bpf_output_data -e sample.o -- \
      dd if=/dev/zero of=test bs=4k count=3

Results in /sys/kernel/debug/tracing/trace:

dd-984 [000] d... 60.894097: : generic_perform_write last=60560862578, cur=60654629075, del=93766497
dd-984 [000] d... 60.896957: : generic_perform_write last=60654629075, cur=60657510709, del=2881634
dd-984 [000] d... 60.897276: : generic_perform_write last=60657510709, cur=60657829953, del=319244

Results shown by perf script:

dd 984 [000] 60.655211: bpf:bpf_output_data: 60560862578 60654629075 93766497
dd 984 [000] 60.657552: bpf:bpf_output_data: 60654629075 60657510709 2881634
dd 984 [000] 60.657898: bpf:bpf_output_data: 60657510709 60657829953 319244
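
Each record above carries the three u64 values written into result[]
by the sample program (last, cur, del). As a rough sketch of how a
consumer might interpret such a payload, assuming it arrives as 24
bytes in native byte order (the struct and function names are
hypothetical, not part of perf or libtraceevent):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical view of one sample: mirrors result[0..2] in the eBPF code. */
struct write_sample {
	uint64_t last;	/* previous timestamp, ns */
	uint64_t cur;	/* current timestamp, ns */
	uint64_t del;	/* cur - last, ns */
};

/* Returns 0 on success, -1 if the payload is not exactly three u64 values. */
static int parse_write_sample(const void *buf, size_t len,
			      struct write_sample *out)
{
	uint64_t v[3];

	assert(buf && out);
	if (len != sizeof(v))
		return -1;

	/* memcpy avoids alignment assumptions about the raw buffer */
	memcpy(v, buf, sizeof(v));
	out->last = v[0];
	out->cur = v[1];
	out->del = v[2];
	return 0;
}

/* Emitted samples always have a previous timestamp, so del == cur - last. */
static int write_sample_consistent(const struct write_sample *s)
{
	return s->del == s->cur - s->last;
}
```

For the first record, 60654629075 - 60560862578 = 93766497, which
matches the del value reported by both the trace buffer and perf script.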

Thank you.

He Kuang (3):
  tracing/events: Fix wrong sample output by storing array length
    instead of size
  tools lib traceevent: Add function to get dynamic arrays length
  bpf: Introduce function for outputing data to perf event

 include/trace/events/bpf.h                      | 30 +++++++++++++
 include/trace/trace_events.h                    |  5 ++-
 include/uapi/linux/bpf.h                        |  7 +++
 kernel/trace/bpf_trace.c                        | 23 ++++++++++
 samples/bpf/bpf_helpers.h                       |  2 +
 tools/lib/traceevent/event-parse.c              | 52 ++++++++++++++++++++++
 tools/lib/traceevent/event-parse.h              |  1 +
 .../util/scripting-engines/trace-event-python.c |  1 +
 8 files changed, 119 insertions(+), 2 deletions(-)
 create mode 100644 include/trace/events/bpf.h

--
1.8.5.2
