Re: perf bpf examples

From: Brendan Gregg
Date: Fri Jul 08 2016 - 03:58:32 EST


On Thu, Jul 7, 2016 at 9:18 PM, Wangnan (F) <wangnan0@xxxxxxxxxx> wrote:
>
>
> On 2016/7/8 1:58, Brendan Gregg wrote:
>>
>> On Thu, Jul 7, 2016 at 10:54 AM, Brendan Gregg
>> <brendan.d.gregg@xxxxxxxxx> wrote:
>>>
>>> On Wed, Jul 6, 2016 at 6:49 PM, Wangnan (F) <wangnan0@xxxxxxxxxx> wrote:
[...]
>> ... Also, has anyone looked into perf sampling (-F 99) with bpf yet?
>> Thanks,
>
>
> Theoretically, BPF program is an additional filter to
> decide whetier an event should be filtered out or pass to perf. -F 99
> is another filter, which drops samples to ensure the frequence.
> Filters works together. The full graph should be:
>
> BPF --> traditional filter --> proc (system wide of proc specific) -->
> period
>
> See the example at the end of this mail. The BPF program returns 0 for half
> of
> the events, and the result should be symmetrical. We can get similar result
> without
> -F:
>
> # ~/perf record -a --clang-opt '-DCATCH_ODD' -e ./sampling.c dd if=/dev/zero
> of=/dev/null count=8388480
> 8388480+0 records in
> 8388480+0 records out
> 4294901760 bytes (4.3 GB) copied, 11.9908 s, 358 MB/s
> [ perf record: Woken up 28 times to write data ]
> [ perf record: Captured and wrote 303.915 MB perf.data (4194449 samples) ]
> #
> root@wn-Lenovo-Product:~# ~/perf record -a --clang-opt '-DCATCH_EVEN' -e
> ./sampling.c dd if=/dev/zero of=/dev/null count=8388480
> 8388480+0 records in
> 8388480+0 records out
> 4294901760 bytes (4.3 GB) copied, 12.1154 s, 355 MB/s
> [ perf record: Woken up 54 times to write data ]
> [ perf record: Captured and wrote 303.933 MB perf.data (4194347 samples) ]
>
>
> With -F99 added:
>
> # ~/perf record -F99 -a --clang-opt '-DCATCH_ODD' -e ./sampling.c dd
> if=/dev/zero of=/dev/null count=8388480
> 8388480+0 records in
> 8388480+0 records out
> 4294901760 bytes (4.3 GB) copied, 9.60126 s, 447 MB/s
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.402 MB perf.data (35 samples) ]
> # ~/perf record -F99 -a --clang-opt '-DCATCH_EVEN' -e ./sampling.c dd
> if=/dev/zero of=/dev/null count=8388480
> 8388480+0 records in
> 8388480+0 records out
> 4294901760 bytes (4.3 GB) copied, 9.76719 s, 440 MB/s
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.399 MB perf.data (37 samples) ]

That looks like it's doing two different things: -F99, and a
sampling.c script (SEC("func=sys_read")).

I mean just an -F99 that executes a BPF program on each sample. My
most common use for perf is:

perf record -F 99 -a -g -- sleep 30
perf report (or perf script, for making flame graphs)

But this uses perf.data as an intermediate file. With the recent
BPF_MAP_TYPE_STACK_TRACE, we could frequency count stack traces in
kernel context, and just dump a report. Much more efficient. And
improving a very common perf one-liner.

Brendan