Re: [PATCH 5/6] mm/zswap: only support zswap_exclusive_loads_enabled

From: Chengming Zhou
Date: Fri Feb 02 2024 - 07:58:08 EST


On 2024/2/2 02:12, Johannes Weiner wrote:
> On Thu, Feb 01, 2024 at 03:49:05PM +0000, Chengming Zhou wrote:
>> The !zswap_exclusive_loads_enabled mode will leave compressed copy in
>> the zswap tree and lru list after the folio swapin.
>>
>> There are some disadvantages in this mode:
>> 1. It's a waste of memory since there are two copies of data, one is
>> folio, the other one is compressed data in zswap. And it's unlikely
>> the compressed data is useful in the near future.
>>
>> 2. If that folio is dirtied, the compressed data must be not useful,
>> but we don't know and don't invalidate the trashy memory in zswap.
>>
>> 3. It's not reclaimable from zswap shrinker since zswap_writeback_entry()
>> will always return -EEXIST and terminate the shrinking process.
>>
>> On the other hand, the only downside of zswap_exclusive_loads_enabled
>> is a little more cpu usage/latency when compression, and the same if
>> the folio is removed from swapcache or dirtied.
>>
>> Not sure if we should accept the above disadvantages in the case of
>> !zswap_exclusive_loads_enabled, so send this out for discussion.
>>
>> Signed-off-by: Chengming Zhou <zhouchengming@xxxxxxxxxxxxx>
>
> This is interesting.
>
> First, I will say that I never liked this config option, because it's
> nearly impossible for a user to answer this question. Much better to
> just pick a reasonable default.

Agreed.

>
> What should the default be?
>
> Caching "swapout work" is helpful when the system is thrashing. Then
> recently swapped in pages might get swapped out again very soon. It
> certainly makes sense with conventional swap, because keeping a clean
> copy on the disk saves IO work and doesn't cost any additional memory.
>
> But with zswap, it's different. It saves some compression work on a
> thrashing page. But the act of keeping compressed memory contributes
> to a higher rate of thrashing. And that can cause IO in other places
> like zswap writeback and file memory.
>
> It would be useful to have an A/B test to confirm that not caching is
> better. Can you run your test with and without keeping the cache, and
> in addition to the timings also compare the deltas for pgscan_anon,
> pgscan_file, workingset_refault_anon, workingset_refault_file?

I just ran an A/B test of a kernel build in a tmpfs directory, with memory.max=2GB
(zswap writeback and shrinker_enabled both on, one 50GB swapfile).
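
For anyone who wants to repeat the comparison, here is one way the per-counter deltas could be collected. This is not necessarily how the numbers below were produced. It is a minimal sketch that assumes two text snapshots in the "name value" per-line format used by /proc/vmstat and cgroup v2 memory.stat, taken before and after the build. The helper names and the sample values are hypothetical.

```python
def parse_counters(text):
    """Parse 'name value' lines into a dict; lines in any other shape are skipped."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue
        name, value = parts
        try:
            counters[name] = int(value)
        except ValueError:
            continue
    return counters


def counter_deltas(before, after, names):
    """Return after - before for each requested counter (missing counters count as 0)."""
    return {n: after.get(n, 0) - before.get(n, 0) for n in names}


# Counters asked for in the review.
WANTED = [
    "pgscan_anon",
    "pgscan_file",
    "workingset_refault_anon",
    "workingset_refault_file",
]

# Hypothetical snapshots, for illustration only (not measured data).
SAMPLE_BEFORE = """pgscan_anon 100
pgscan_file 20
workingset_refault_anon 7
workingset_refault_file 3
"""
SAMPLE_AFTER = """pgscan_anon 450
pgscan_file 25
workingset_refault_anon 90
workingset_refault_file 3
"""

if __name__ == "__main__":
    deltas = counter_deltas(
        parse_counters(SAMPLE_BEFORE), parse_counters(SAMPLE_AFTER), WANTED
    )
    for name in WANTED:
        print(f"{name:28s} {deltas[name]}")
```

In a real run the two snapshots would be read from the stat file before and after the workload, once per kernel, and the resulting deltas compared.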

From the results below, exclusive mode has fewer scans and refaults.

                          zswap-invalidate-entry  zswap-invalidate-entry-exclusive
real                                       63.80                             63.01
user                                     1063.83                           1061.32
sys                                       290.31                            266.15

                          zswap-invalidate-entry  zswap-invalidate-entry-exclusive
workingset_refault_anon               2383084.40                        1976397.40
workingset_refault_file                 44134.00                          45689.40
workingset_activate_anon               837878.00                         728441.20
workingset_activate_file                 4710.00                           4085.20
workingset_restore_anon                732622.60                         639428.40
workingset_restore_file                  1007.00                            926.80
workingset_nodereclaim                      0.00                              0.00
pgscan                               14343003.40                       12409570.20
pgscan_kswapd                               0.00                              0.00
pgscan_direct                        14343003.40                       12409570.20
pgscan_khugepaged                           0.00                              0.00
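
To put the two columns in proportion, the relative change from the non-exclusive to the exclusive kernel can be computed directly from the figures above. The sketch below copies a few rows from the results. The names `RESULTS` and `pct_change` are just illustrative.

```python
def pct_change(base, new):
    """Relative change from base to new, in percent (negative means a reduction)."""
    return (new - base) / base * 100.0


# (non-exclusive, exclusive) pairs copied from the results above.
RESULTS = {
    "real": (63.80, 63.01),
    "sys": (290.31, 266.15),
    "workingset_refault_anon": (2383084.40, 1976397.40),
    "workingset_refault_file": (44134.00, 45689.40),
    "pgscan": (14343003.40, 12409570.20),
}

CHANGES = {name: pct_change(base, new) for name, (base, new) in RESULTS.items()}

if __name__ == "__main__":
    for name, change in CHANGES.items():
        print(f"{name:26s} {change:+.2f}%")
```

On these numbers, exclusive mode shows roughly 17% fewer anon refaults, 13% fewer scanned pages and about 8% less sys time, while workingset_refault_file is about 3.5% higher.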