Re: perf bpf examples

From: Brendan Gregg
Date: Fri Jul 08 2016 - 12:37:31 EST


On Fri, Jul 8, 2016 at 3:46 AM, Wangnan (F) <wangnan0@xxxxxxxxxx> wrote:
>
>
> On 2016/7/8 15:57, Brendan Gregg wrote:
>>
[...]
>> I mean just an -F99 that executes a BPF program on each sample. My
>> most common use for perf is:
>>
>> perf record -F 99 -a -g -- sleep 30
>> perf report (or perf script, for making flame graphs)
>>
>> But this uses perf.data as an intermediate file. With the recent
>> BPF_MAP_TYPE_STACK_TRACE, we could frequency count stack traces in
>> kernel context, and just dump a report. Much more efficient. And
>> improving a very common perf one-liner.
>
>
> You can't attach BPF script to samples other than kprobe and tracepoints.
> When you use 'perf record -F99 -a -g -- sleep 30', you are sampling on
> 'cycles:ppp' event. This is a hardware PMU event.

Sure, either cycles:ppp or cpu-clock (my Xen guests have no PMU,
sadly). But These are ultimately calling perf_swevent_hrtimer()/etc,
so I was wondering if someone was already looking at enhancing this
code to support BPF? Ie, BPF should be able to attach to kprobes,
uprobes, tracepoints, and timer-based samples.

> If we find a kprobe or tracepoint event which would be triggered 99 times
> in each second, we can utilize BPF_MAP_TYPE_STACK_TRACE and
> bpf_get_stackid().

Yes, that should be a workaround. It's annoying as some like
perf_swevent_hrtimer() can't be kprobed (inlined?), but I found
perf_misc_flags(struct pt_regs *regs) was called, but passing in that
regs to bpf_get_stackid() was returning "type=inv expected=ctx"
errors, despite casting. I'm guessing the BPF ctx type is special and
can't be casted, but need to dig more.

Brendan