Re: [PATCH v2 0/4] perf: enable compression of record mode trace to save storage space

From: Alexey Budankov
Date: Tue Jan 29 2019 - 11:39:27 EST


On 29.01.2019 15:13, Arnaldo Carvalho de Melo wrote:
> Em Tue, Jan 29, 2019 at 02:39:00PM +0300, Alexey Budankov escreveu:
>> Hi,
>> On 29.01.2019 13:53, Arnaldo Carvalho de Melo wrote:
>>> Em Tue, Jan 29, 2019 at 11:45:43AM +0100, Arnaldo Carvalho de Melo escreveu:
>>>> Em Mon, Jan 28, 2019 at 09:40:28AM +0300, Alexey Budankov escreveu:
>>>>> The patch set implements runtime trace compression for record mode and
>>>>> trace file decompression for report mode. Zstandard API [1] is used for
>>>>> compression/decompression of data that come from perf_events kernel
>>>>
>>>> Interesting, wasn't aware of this zstd library, I wonder if we can add
>>>> it and switch the other compression libraries we link against, so that
>>>> we're not adding one more library to the dep list of perf but removing
>>>> some instead, do you think this would be possible?
>>
>> Replacing the compression APIs that perf already uses was not evaluated or
>> tested within the scope of this patch set. However, according to the numbers
>> in the Zstd docs and the numbers we got during testing, the Zstd API
>> outperforms the existing compression libraries (at least libz) in terms of
>> both speed and compression ratio. Backward compatibility needs to be taken
>> into account so that old perf files can still be opened by newer perf tool versions.
>
> Right, I'm not talking in the scope of this patch, it's just that while
> looking at it, I noticed that we're adding yet another compression
> library and its description seemed to imply it would support the other
> compression formats, which I've learned is not the case, so never mind.
>
> I'm not talking about using just zstd, as what we mostly do with the
> compression libraries is to decompress, not compress, for instance, we
> need to uncompress kernel modules to get to their symbols, do annotation
> with them, etc.

Well, yes, having a single compression/decompression API implementation that
would handle different compression formats and scenarios sounds reasonable
from a support cost perspective.

It looks like the Zstandard library could provide such capabilities.
The library is well supported at the moment, so fixes and extensions are
released in a timely manner.
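
To illustrate what a single decompression entry point could look like, here
is a rough Python sketch (not perf code). It picks the decoder from the
leading magic bytes of the data. The names detect_format() and
decompress_any() are made up for this example, and zstd is only detected, not
decoded, because this sketch assumes no zstd binding is available, much like
a zstd binary built without HAVE_LZMA cannot decode xz files.

```python
import bz2
import gzip
import lzma

# Leading magic bytes of each format, checked in order.
MAGIC = (
    (b"\x1f\x8b", "gzip"),
    (b"\xfd7zXZ\x00", "xz"),
    (b"BZh", "bzip2"),
    (b"\x28\xb5\x2f\xfd", "zstd"),
)

# Decoders this sketch has available; zstd is deliberately absent
# (assumption: no zstd binding is installed).
DECOMPRESSORS = {
    "gzip": gzip.decompress,
    "xz": lzma.decompress,
    "bzip2": bz2.decompress,
}


def detect_format(data):
    """Return the format name for the data's magic bytes, or None."""
    for magic, name in MAGIC:
        if data.startswith(magic):
            return name
    return None


def decompress_any(data):
    """Decompress data with whichever decoder matches its magic bytes."""
    fmt = detect_format(data)
    if fmt is None:
        raise ValueError("unrecognized compression format")
    if fmt not in DECOMPRESSORS:
        # Format recognized, but no decoder was built in.
        raise NotImplementedError(fmt + " support is not available")
    return DECOMPRESSORS[fmt](data)
```

A caller would pass the raw bytes, e.g. decompress_any(gzip.compress(b"abc")),
and would not need to know which library produced them.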

-Alexey

>
> - Arnaldo
>
>> -Alexey
>>
>>>>
>>>> $ ldd ~/bin/perf | wc -l
>>>> 30
>>>> $ ldd ~/bin/perf | grep z
>>>> liblzma.so.5 => /lib64/liblzma.so.5 (0x00007f3dcc356000)
>>>> libz.so.1 => /lib64/libz.so.1 (0x00007f3dcb2aa000)
>>>> libbz2.so.1 => /lib64/libbz2.so.1 (0x00007f3dcb218000)
>>>> $
>>>>
>>>> Humm, from the github page it says:
>>>>
>>>> -----
>>>> The project is provided as an open-source dual BSD and GPLv2 licensed C
>>>> library, and a command line utility producing and decoding .zst, .gz,
>>>> .xz and .lz4 files. Should your project require another programming
>>>> language, a list of known ports and bindings is provided on Zstandard
>>>> homepage.
>>>> -----
>>>>
>>>> So it would cover just liblzma and libz, right?
>>>
>>> Never mind;
>>>
>>> [acme@quaco perf]$ zstdcat ~/git/perf/perf-5.0.0-rc2.tar.xz
>>> zstd: /home/acme/git/perf/perf-5.0.0-rc2.tar.xz: xz/lzma file cannot be uncompressed (zstd compiled without HAVE_LZMA) -- ignored
>>>
>>> So it handles those formats, _if_ linked with those libraries, duh.
>>>
>>> - Arnaldo
>>>
>