Re: [PATCH v8 00/12] perf: enable compression of record mode trace to save storage space

From: Alexey Budankov
Date: Fri Mar 15 2019 - 09:43:33 EST


On 15.03.2019 15:28, Jiri Olsa wrote:
> On Thu, Mar 14, 2019 at 02:26:23PM +0300, Alexey Budankov wrote:
>>
<SNIP>
>> The patch set implements runtime trace compression (-z option) in
>> record mode and trace auto decompression in report and inject modes.
>> Streaming Zstd API [1] is used for compression and decompression of
>> data that come from kernel mmaped data buffers.
>>
<SNIP>
>> $ tools/perf/perf record -z -e cycles -- matrix.gcc
>> $ tools/perf/perf record --aio -z -e cycles -- matrix.gcc
>> $ tools/perf/perf record -z --mmap-flush 1024 -e cycles -- matrix.gcc
>> $ tools/perf/perf record --aio -z --mmap-flush 1K -e cycles -- matrix.gcc
>
> hi,
> I'm getting error with -z:
>
> [root@krava perf]# ./perf record -z ./perf bench sched messaging -l 10000
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
>
> Total time: 18.775 [sec]
> [ perf record: Woken up 57 times to write data ]
> 0x5228 [0]: failed to process type: 81
> [ perf record: Captured and wrote 6.453 MB perf.data, compressed (original 21.486 MB, ratio is 3.340) ]

Reproduced locally. Investigating right now.

$ tools/perf/perf record -z tools/perf/perf bench sched messaging -l 10000
# Running 'sched/messaging' benchmark:
# 20 sender and receiver processes per group
# 10 groups == 400 processes run

Total time: 8.799 [sec]
[ perf record: Woken up 35 times to write data ]
0x2e48 [0]: failed to process type: 81
[ perf record: Captured and wrote 3.142 MB perf.data, compressed (original 10.241 MB, ratio is 3.272) ]

However, the error is not observed in my tests on an 8-core Skylake:

$ tools/perf/perf record -z ../../matrix/linux/matrix.gcc
Addr of buf1 = 0x7f2eca1ab010
Offs of buf1 = 0x7f2eca1ab180
Addr of buf2 = 0x7f2ec81aa010
Offs of buf2 = 0x7f2ec81aa1c0
Addr of buf3 = 0x7f2ec61a9010
Offs of buf3 = 0x7f2ec61a9100
Addr of buf4 = 0x7f2ec41a8010
Offs of buf4 = 0x7f2ec41a8140
Threads #: 8 Pthreads
Matrix size: 2048
Using multiply kernel: multiply1
Execution time = 30.075 seconds
[ perf record: Woken up 127 times to write data ]
[ perf record: Captured and wrote 6.820 MB perf.data (953438 samples), compressed (original 36.372 MB, ratio is 5.344) ]

Thanks,
Alexey

>
>
> jirka
>