Re: [PATCH 0/7] mm/zswap: optimize the scalability of zswap rb-tree

From: Nhat Pham
Date: Wed Dec 06 2023 - 12:24:47 EST


+ Chris Li

Chris, I vaguely remember from our last conversation that you have
some concurrent efforts to use xarray here, right?

On Wed, Dec 6, 2023 at 1:46 AM Chengming Zhou
<zhouchengming@xxxxxxxxxxxxx> wrote:
>
> Hi everyone,
>
> This patch series is based on linux-next 20231205, which depends on
> the "workload-specific and memory pressure-driven zswap writeback" series
> from Nhat Pham.
>
> When testing zswap performance with a kernel build -j32 in a tmpfs
> directory, I found that the zswap rb-tree scales poorly because it is
> protected by a single spinlock. That causes heavy lock contention when
> multiple tasks call zswap_store/load concurrently.
>
> So a simple solution is to split the single zswap rb-tree into multiple
> rb-trees, each corresponding to SWAP_ADDRESS_SPACE_PAGES (64M). This idea
> comes from commit 4b3ef9daa4fc ("mm/swap: split swap cache into 64MB trunks").
>
> Although this method can't solve the spinlock contention completely, it
> can mitigate much of that contention.
>
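
To make the split concrete, here is a minimal user-space sketch of how a
swap offset could be mapped to one of the per-64MB trees. The shift value
matches SWAP_ADDRESS_SPACE_SHIFT in include/linux/swap.h (64MB assumes 4KB
pages); the helper names nr_zswap_trees() and zswap_tree_index() are
hypothetical and not taken from the patches.

```c
#include <assert.h>

/* 2^14 pages per tree: 64MB with 4KB pages, as in include/linux/swap.h. */
#define SWAP_ADDRESS_SPACE_SHIFT	14
#define SWAP_ADDRESS_SPACE_PAGES	(1UL << SWAP_ADDRESS_SPACE_SHIFT)

/* Hypothetical helper: how many trees cover a swapfile of nr_pages pages. */
static unsigned long nr_zswap_trees(unsigned long nr_pages)
{
	/* Round up so a partial last chunk still gets its own tree. */
	return (nr_pages + SWAP_ADDRESS_SPACE_PAGES - 1) >> SWAP_ADDRESS_SPACE_SHIFT;
}

/* Hypothetical helper: index of the tree (and lock) owning a swap offset. */
static unsigned long zswap_tree_index(unsigned long offset, unsigned long nr_pages)
{
	assert(offset < nr_pages);
	return offset >> SWAP_ADDRESS_SPACE_SHIFT;
}
```

Offsets in different 64MB chunks land in different trees and so take
different locks; tasks working inside the same chunk still contend, which
is why this only mitigates the problem.
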
> Another problem found when testing zswap with our default zsmalloc is that
> zswap_load() and zswap_writeback_entry() have to allocate a temporary
> buffer to support !zpool_can_sleep_mapped().
>
> Optimize it by reusing the percpu crypto_acomp_ctx->dstmem, which is also
> used by zswap_store() and protected by the same percpu crypto_acomp_ctx->mutex.
>
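
A rough user-space sketch of the buffer reuse described above, assuming a
single context in place of the per-CPU crypto_acomp_ctx and a plain memcpy
in place of the real decompression. The struct acomp_ctx_sketch,
SKETCH_PAGE_SIZE and load_sketch() names are made up for illustration, and
the one-page buffer size is an assumption based on the "change dstmem size
to one page" patch title.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096	/* assumed page size */

/* Stand-in for crypto_acomp_ctx: one mutex guarding one scratch buffer. */
struct acomp_ctx_sketch {
	pthread_mutex_t mutex;
	unsigned char dstmem[SKETCH_PAGE_SIZE];
};

static struct acomp_ctx_sketch ctx = { .mutex = PTHREAD_MUTEX_INITIALIZER };

/*
 * Hypothetical load path: instead of allocating a bounce buffer per call,
 * copy the compressed data into the shared dstmem while holding the mutex.
 */
static int load_sketch(const unsigned char *src, size_t len, unsigned char *page)
{
	assert(src && page);
	if (len > SKETCH_PAGE_SIZE)
		return -1;

	pthread_mutex_lock(&ctx.mutex);
	memcpy(ctx.dstmem, src, len);	/* takes the place of the temporary allocation */
	memcpy(page, ctx.dstmem, len);	/* placeholder for the decompression step */
	pthread_mutex_unlock(&ctx.mutex);
	return 0;
}
```

Since the store path already serializes on the same mutex, the load path
can borrow the buffer without adding a new lock.
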
> Thanks for your review and comments!
>
> To: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> To: Seth Jennings <sjenning@xxxxxxxxxx>
> To: Dan Streetman <ddstreet@xxxxxxxx>
> To: Vitaly Wool <vitaly.wool@xxxxxxxxxxxx>
> To: Nhat Pham <nphamcs@xxxxxxxxx>
> To: Johannes Weiner <hannes@xxxxxxxxxxx>
> To: Yosry Ahmed <yosryahmed@xxxxxxxxxx>
> To: Michal Hocko <mhocko@xxxxxxxxxx>
> Cc: linux-kernel@xxxxxxxxxxxxxxx
> Cc: linux-mm@xxxxxxxxx
> Signed-off-by: Chengming Zhou <zhouchengming@xxxxxxxxxxxxx>
>
> ---
> Chengming Zhou (7):
> mm/zswap: make sure each swapfile always have zswap rb-tree
> mm/zswap: split zswap rb-tree
> mm/zswap: reuse dstmem when decompress
> mm/zswap: change dstmem size to one page
> mm/zswap: refactor out __zswap_load()
> mm/zswap: cleanup zswap_load()
> mm/zswap: cleanup zswap_reclaim_entry()
>
> include/linux/zswap.h | 4 +-
> mm/swapfile.c | 10 ++-
> mm/zswap.c | 233 +++++++++++++++++++++-----------------------------
> 3 files changed, 106 insertions(+), 141 deletions(-)
> ---
> base-commit: 0f5f12ac05f36f117e793656c3f560625e927f1b
> change-id: 20231206-zswap-lock-optimize-06f45683b02b
>
> Best regards,
> --
> Chengming Zhou <zhouchengming@xxxxxxxxxxxxx>