Re: [PATCHv4 02/17] zram: do not use per-CPU compression streams
From: Sergey Senozhatsky
Date: Sun Feb 02 2025 - 22:49:59 EST
On (25/02/01 17:21), Kairui Song wrote:
> This seems to cause a huge performance regression on multi-core
> systems, which becomes especially significant as the number of
> concurrent tasks increases:
>
> Test build linux kernel using ZRAM as SWAP (1G memcg):
>
> Before:
> + /usr/bin/time make -s -j48
> 2495.77user 2604.77system 2:12.95elapsed 3836%CPU (0avgtext+0avgdata
> 863304maxresident)k
>
> After:
> + /usr/bin/time make -s -j48
> 2403.60user 6676.09system 3:38.22elapsed 4160%CPU (0avgtext+0avgdata
> 863276maxresident)k
How many CPUs do you have? I assume preemption gets in the way, which is
sort of expected, to be honest... Using per-CPU compression streams
disables preemption and uses the CPU exclusively, at the price of other
tasks not being able to run. I do tend to think that I made a mistake by
switching zram to per-CPU compression streams.
What preemption model do you use and to what extent do you overload
your system?
My tests don't show anything unusual (but I don't overload the system).
CONFIG_PREEMPT
before
1371.96user 156.21system 1:30.91elapsed 1680%CPU (0avgtext+0avgdata 825636maxresident)k
32688inputs+1768416outputs (259major+51539861minor)pagefaults 0swaps
after
1372.05user 155.79system 1:30.82elapsed 1682%CPU (0avgtext+0avgdata 825684maxresident)k
32680inputs+1768416outputs (273major+51541815minor)pagefaults 0swaps
(I use zram as a block device with ext4 on it.)
> `perf lock contention -ab sleep 3` also indicates the big spin lock in
> zcomp_stream_get/put is having significant contention:
Hmm, it's just
spin_lock()
list first entry
spin_unlock()
That shouldn't be "a big spin lock", which is very odd. I'm not familiar
with perf lock contention, so let me take a look.
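
To make the lock/first-entry/unlock pattern above concrete, here is a
minimal userspace sketch. It is not the zram code: struct stream,
struct stream_pool and all function names are hypothetical, a C11
atomic_flag stands in for the kernel spinlock_t, and a hand-rolled
singly linked list stands in for the kernel list helpers. Returning
NULL on an empty pool is also an assumption made for the sketch.

```c
/* Userspace sketch only; every name here is hypothetical. */
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct stream {
	struct stream *next;	/* link in the idle list */
	int id;
};

struct stream_pool {
	atomic_flag lock;	/* stand-in for a kernel spinlock_t */
	struct stream *idle;	/* head of the idle list */
};

static void pool_lock(struct stream_pool *p)
{
	/* Busy-wait until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&p->lock,
						 memory_order_acquire))
		;
}

static void pool_unlock(struct stream_pool *p)
{
	atomic_flag_clear_explicit(&p->lock, memory_order_release);
}

/* Put n caller-provided streams on the idle list, s[0] first. */
static void pool_init(struct stream_pool *p, struct stream *s, size_t n)
{
	atomic_flag_clear(&p->lock);
	p->idle = NULL;
	for (size_t i = n; i-- > 0; ) {
		s[i].id = (int)i;
		s[i].next = p->idle;
		p->idle = &s[i];
	}
}

/* lock, take the first idle entry, unlock; NULL if none is idle. */
static struct stream *stream_get(struct stream_pool *p)
{
	struct stream *s;

	pool_lock(p);
	s = p->idle;
	if (s)
		p->idle = s->next;
	pool_unlock(p);
	return s;
}

/* lock, push the entry back on the idle list, unlock. */
static void stream_put(struct stream_pool *p, struct stream *s)
{
	assert(s != NULL);
	pool_lock(p);
	s->next = p->idle;
	p->idle = s;
	pool_unlock(p);
}
```

The critical section is a couple of pointer updates, so the hold time
is tiny. Contention on a lock like this would then depend mostly on how
many tasks call get/put at the same time, not on the work done under it.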