Kernel compression benchmarks

From: Alex Xu (Hello71)
Date: Wed Jul 01 2020 - 10:38:03 EST


Hi all,

ZSTD compression patches have been sent in a number of times over the
past few years. Every time, someone asks for benchmarks. Every time,
someone is concerned about compression time. Sometimes, someone provides
benchmarks.

But, as far as I can tell, nobody has considered the compression
parameters, which have a significant impact on compression time and ratio.

So, I did some benchmarks myself, including all the compression levels
for each compressor.
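
For illustration only, a level sweep of this kind can be sketched with the
compressors that ship in Python's standard library. This is not the script
behind the attached results, which came from the command-line tools listed
under Versions. The names CODECS and sweep are made up for this sketch,
"zlib" stands in for gzip (same DEFLATE algorithm, different container), and
lz4, lzo and zstd are left out because they need third-party bindings.

```python
import bz2
import lzma
import time
import zlib

# (compress(data, level), decompress(data), levels) per codec.
# Level ranges are those accepted by the Python bindings.
CODECS = {
    "zlib": (zlib.compress, zlib.decompress, range(1, 10)),
    "bzip2": (bz2.compress, bz2.decompress, range(1, 10)),
    "xz": (
        lambda data, level: lzma.compress(data, preset=level),
        lzma.decompress,
        range(0, 10),
    ),
}


def sweep(data, codecs=CODECS):
    """Compress and decompress data once per (codec, level); return one row each."""
    rows = []
    for name, (compress, decompress, levels) in codecs.items():
        for level in levels:
            start = time.perf_counter()
            packed = compress(data, level)
            compress_s = time.perf_counter() - start

            start = time.perf_counter()
            unpacked = decompress(packed)
            decompress_s = time.perf_counter() - start

            if unpacked != data:
                raise ValueError(f"{name} -{level}: round trip changed the data")
            rows.append({
                "codec": name,
                "level": level,
                "compressed_bytes": len(packed),
                "compress_s": compress_s,
                "decompress_s": decompress_s,
            })
    return rows
```

Calling sweep() on the bytes of a kernel image or initramfs gives one row per
codec and level, which is the shape of data needed to plot time and size
against compression level. Absolute timings from library bindings will differ
from those of the command-line tools.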

Results:

The results are attached as SVG graphs and CSV data.

Summary:

- compression level, predictably, has a huge impact on compression time.
- compression level has virtually no impact on decompression time for
lz4 and zstd, and some effect on the others. interestingly, xz
decompresses slightly faster at higher compression levels (perhaps
cache-related).
- gzip compresses slightly faster than zstd at medium compression levels.
- bzip2 sucks: slow compression, very slow decompression, poor ratio.
- lzma decompresses slightly faster than xz, but is also slightly larger.
- xz is smallest but with very slow compression and decompression.
- lz4 decompresses fastest.
- zstd is a good balanced default.
- 7z is much faster than xz, even with wine overhead.

Files:

For the kernel, I did "make allmodconfig; sed -i -e '/=m$/d' .config"
with a 5.6 kernel and gcc 9.3.0 on x86_64, then concatenated vmlinux.bin
and vmlinux.relocs. For the initramfs, I used the Arch Linux fallback
initramfs with default hooks.
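
As a rough sketch of the two file-preparation steps above: the sed expression
deletes every .config line that ends in "=m", and the test input is the two
build outputs written back to back. The helper names drop_modules and
concatenate are hypothetical, and the caller has to supply the real paths to
vmlinux.bin and vmlinux.relocs, which are not spelled out here.

```python
from pathlib import Path


def drop_modules(config_text):
    """Mirror sed -e '/=m$/d': delete every line that ends in '=m'."""
    kept = [line for line in config_text.splitlines() if not line.endswith("=m")]
    return "".join(line + "\n" for line in kept)


def concatenate(parts, output):
    """Write the given files back to back into output; return total bytes written."""
    total = 0
    with open(output, "wb") as out:
        for part in parts:
            data = Path(part).read_bytes()
            out.write(data)
            total += len(data)
    return total
```

For example, concatenate(["vmlinux.bin", "vmlinux.relocs"], "kernel.bin")
would produce a single input file for the compressors, where "kernel.bin" is
just a placeholder name.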

Versions:

gzip 1.10
bzip2, a block-sorting file compressor. Version 1.0.8, 13-Jul-2019.
xz (XZ Utils) 5.2.5
*** LZ4 command line interface 64-bits v1.9.2, by Yann Collet ***
lzop 1.04
LZO library 2.10
*** zstd command line interface 64-bits v1.4.4, by Yann Collet ***
7-Zip 19.00 (x64) : Copyright (c) 1999-2018 Igor Pavlov : 2019-02-21

Notes:

I used the userspace versions of the decompressors, not the kernel
versions. This is particularly relevant for xz, as the kernel xzminidec
is significantly slower than xz.

pigz is faster than gzip, but I used gzip as a common baseline.

7-Zip was run through wine with a persistent wineserver.

I ran the benchmark on a Ryzen 1600, with turbo boost turned off. Each
test was run only once, on the basis that any noise wouldn't disrupt the
overall curve, and because I didn't want to spend hours waiting for the
results.
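
A single timed run of a command-line compressor, in the spirit of the
methodology above, might look like the sketch below. It is an assumption
about how such a measurement could be taken, not the harness actually used.
time_command is a hypothetical helper, and the measured time includes process
startup.

```python
import subprocess
import time


def time_command(argv, stdin_bytes=b""):
    """Run argv once, feeding stdin_bytes; return (wall-clock seconds, stdout bytes)."""
    start = time.perf_counter()
    result = subprocess.run(
        argv,
        input=stdin_bytes,
        stdout=subprocess.PIPE,
        check=True,  # raise if the tool exits with a non-zero status
    )
    elapsed = time.perf_counter() - start
    return elapsed, result.stdout
```

With gzip installed, time_command(["gzip", "-9"], data) returns the elapsed
time and the compressed stream, since gzip reads standard input and writes
standard output when no file names are given.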

The current compression level defaults are:

- gzip -9
- bzip2 -9
- lzma -9
- xz --check=crc32 --x86 --lzma2=,dict=32MiB # except on ppc
- lzop -9
- lz4 -l -1

My conclusions:

- zstd is an improvement on almost all metrics.
- bzip2 and lzma should be removed post-haste.
- lzo should be removed once zstd is merged.
- compression level is important to consider for compression speed: the
default lz4 -1 compresses very fast but has a very poor compression
ratio. zstd -19 compresses barely better than zstd -18, but takes
significantly longer to compress.
- compression level should be configurable: lz4 -1 is useful, but so is
lz4 -9. zstd -1 is useful, but so is zstd -19. zstd -1 is useful for
developers who want kernel builds as fast as possible, zstd -19 for
everybody else.
- gzip is far from the fastest compressor (even excluding cat).
- modern compressors (xz, lz4, zstd) decompress at about the same speed
at every compression level; higher levels only require more memory.
- 7-Zip is much faster than xz; this needs more research.
- 7-Zip BCJ2 is slightly better than xz/BCJ. better filters for all
archs would probably be a good area of research, as BCJ/BCJ2 are
apparently intended only for 32-bit x86.
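
The point about diminishing returns at the highest levels can be checked
mechanically by comparing each level with the one below it. The sketch below
assumes the per-level results are available as (level, compressed bytes,
compression seconds) tuples. That layout and the name marginal_gains are
assumptions for illustration, not the format of the attached CSV data.

```python
def marginal_gains(rows):
    """rows: (level, compressed_bytes, compress_seconds) tuples, sorted by level.

    Returns one tuple per step up in level:
    (level, percent of size saved vs. the previous level,
     compression time as a multiple of the previous level's time).
    """
    gains = []
    for (_, prev_size, prev_s), (level, size, secs) in zip(rows, rows[1:]):
        saved_pct = 100.0 * (prev_size - size) / prev_size
        time_ratio = secs / prev_s if prev_s > 0 else float("inf")
        gains.append((level, saved_pct, time_ratio))
    return gains
```

A level whose saved percentage is close to zero while its time multiple is
large is a candidate for "not worth it" as a default, which is the kind of
trade-off a configurable compression level lets users decide for themselves.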

Thanks,
Alex.

Attachment: kernel-compression-benchmarks.tar.gz
Description: application/compressed-tar