Hi Johannes,
Sure. zstd/8 cores/make -j32:
zsmalloc:
real 7m36.413s
user 38m0.481s
sys 7m19.108s
Zswap: 211028 kB
Zswapped: 925904 kB
zswpin 397851
zswpout 1625707
zswpwb 5126
zblock:
real 7m55.009s
user 39m23.147s
sys 7m44.004s
Zswap: 253068 kB
Zswapped: 919956 kB
zswpin 456843
zswpout 2058963
zswpwb 3921
So zstd results in nearly double the compression ratio, which in turn
cuts total execution time *almost in half*.
The numbers speak for themselves. Compression efficiency >>> allocator
speed, because compression efficiency ultimately drives the continuous
*rate* at which allocations need to occur. You're trying to optimize a
constant coefficient at the expense of a higher-order one, which is a
losing proposition.
Actually, there's a slight bug in the zblock code for the 4K page case
that caused storage inefficiency for small (i.e. well-compressed) memory
blocks. With that fixed, the results look a lot brighter for zblock:
1. zblock/zstd/8 cores/make -j32 bzImage
real 7m28.290s
user 37m27.055s
sys 7m18.629s
Zswap: 221516 kB
Zswapped: 904104 kB
zswpin 425424
zswpout 2011503
zswpwb 4111
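
For comparing the three runs, the effective compression ratio can be
estimated as Zswapped / Zswap. This is a rough sketch, assuming the usual
/proc/meminfo meaning of those fields: Zswapped is the original size of
the data held in zswap, and Zswap is the memory the compressed pool
actually consumes. The numbers are copied from the results above; the
run labels and helper names are made up for illustration.

```python
# Figures copied from the benchmark output quoted in this thread.
# zswap_kb    = "Zswap"    (memory consumed by the compressed pool, kB)
# zswapped_kb = "Zswapped" (original size of the data stored, kB)
# real_s      = wall-clock "real" time converted to seconds
RUNS = {
    "zsmalloc": {
        "zswap_kb": 211028, "zswapped_kb": 925904, "real_s": 7 * 60 + 36.413,
    },
    "zblock": {
        "zswap_kb": 253068, "zswapped_kb": 919956, "real_s": 7 * 60 + 55.009,
    },
    "zblock (fixed)": {
        "zswap_kb": 221516, "zswapped_kb": 904104, "real_s": 7 * 60 + 28.290,
    },
}


def effective_ratio(run):
    """Original bytes stored per byte of pool memory (higher is better)."""
    return run["zswapped_kb"] / run["zswap_kb"]


def summarize(runs):
    """Return {label: (effective ratio, wall time in seconds)}."""
    return {name: (effective_ratio(r), r["real_s"]) for name, r in runs.items()}


if __name__ == "__main__":
    for name, (ratio, real_s) in summarize(RUNS).items():
        print(f"{name:15s} ratio {ratio:.2f}  real {real_s:.1f}s")
```

Note that this ratio includes allocator overhead (fragmentation, metadata),
so it reflects storage efficiency of the backend rather than the raw zstd
ratio alone. A single run per configuration also says nothing about
run-to-run variance.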