Re: [RFC PATCH] mm: bypass swap readahead for zswap
From: Alexandre Ghiti
Date: Thu Jun 25 2026 - 03:26:46 EST
Hi David,
On 6/24/26 16:58, David Hildenbrand (Arm) wrote:
On 6/24/26 09:55, Alexandre Ghiti wrote:
Commit 0bcac06f27d7 ("mm, swap: skip swapcache for swapin of synchronous[...]
device") made SWP_SYNCHRONOUS_IO devices (e.g. zram) skip swap readahead.
zswap is the same kind of in-memory, synchronous backend as zram, but it is
not a swap device flagged SWP_SYNCHRONOUS_IO, so it still goes through
swapin_readahead().
Here are the results from bypassing readahead for zswap too, measured with
a kernel build (make -j16) in a memcg, with zswap=zstd and the shrinker
off, on Sapphire Rapids over 3 iterations.
768M memcg (sustained swap thrash):
  metric                   mm-new   + bypass   delta
  build time (s)            405.0      341.7   -15.6%
  zswap-in (GB)              79.5       53.0   -33%
  zswap-out (GB)            144.8      115.6   -20%
  swap readahead (pages)    6.79M      0.45M   -93%
  swap_ra hit (%)            72.1       89.9   +18pp
1G memcg (light pressure, build not memory-bound):
  metric                   mm-new   + bypass   delta
  build time (s)            177.7      176.0   ~same (no regression)
  zswap-in (GB)              10.2        7.5   -26%
  zswap-out (GB)             27.7       25.1   -9%
  swap readahead (pages)    1.07M      0.08M   -93%
  swap_ra hit (%)            68.6       87.2   +19pp
The gain comes from no longer prefetching pages, which is pointless for an
in-memory backend: readahead inflates anon residency and thrashes the
page cache (file pages get evicted and re-read), lengthens each fault by
synchronously (de)compressing a cluster of neighbours, and adds
compression traffic when those extra pages are reclaimed.
Bypassing swap readahead for zswap therefore makes sense.
Signed-off-by: Alexandre Ghiti <alex@xxxxxxxx>
---
#endif /* _LINUX_ZSWAP_H */
diff --git a/mm/memory.c b/mm/memory.c
index ff338c2abe92..5aa1ea9eb48a 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4827,8 +4827,9 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	if (folio)
 		swap_update_readahead(folio, vma, vmf->address);
 	if (!folio) {
-		/* Swapin bypasses readahead for SWP_SYNCHRONOUS_IO devices */
-		if (data_race(si->flags & SWP_SYNCHRONOUS_IO))
+		/* Swapin bypasses readahead for SWP_SYNCHRONOUS_IO devices and zswap */
+		if (data_race(si->flags & SWP_SYNCHRONOUS_IO) ||
+		    zswap_present_test(entry))
This should really be abstracted into a reasonably-named helper that can live in
swap code.
Makes sense, I'll come up with something.
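Purely as an illustration of the shape such a helper could take, here is a
user-space mock-up. It is not kernel code and not the eventual patch: the
struct types, the in_zswap field and the name mock_swapin_skip_readahead()
are all hypothetical, and the flag value is only a stand-in. The point is
just that the caller asks one question ("should this swapin skip
readahead?") instead of open-coding both conditions.

```c
/* User-space mock-up for illustration only; NOT kernel code. */
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the kernel flag; the value here is only illustrative. */
#define SWP_SYNCHRONOUS_IO (1u << 12)

/* Hypothetical, minimal stand-in for struct swap_info_struct. */
struct mock_swap_info {
	unsigned int flags;
};

/*
 * Hypothetical stand-in for a swap entry. in_zswap models "this entry
 * is currently held in zswap", i.e. what the patch's
 * zswap_present_test(entry) is meant to answer.
 */
struct mock_swp_entry {
	bool in_zswap;
};

/*
 * Hypothetical helper: returns true when swapin should bypass
 * readahead, either because the device is flagged synchronous or
 * because the entry is backed by zswap.
 */
static bool mock_swapin_skip_readahead(const struct mock_swap_info *si,
				       struct mock_swp_entry entry)
{
	return (si->flags & SWP_SYNCHRONOUS_IO) || entry.in_zswap;
}

/* Small usage example covering the three interesting cases. */
void example_usage(void)
{
	struct mock_swap_info sync_dev = { .flags = SWP_SYNCHRONOUS_IO };
	struct mock_swap_info disk = { .flags = 0 };
	struct mock_swp_entry in_zswap = { .in_zswap = true };
	struct mock_swp_entry on_disk = { .in_zswap = false };

	/* Synchronous device: bypass regardless of zswap state. */
	assert(mock_swapin_skip_readahead(&sync_dev, on_disk));
	/* Regular device, entry held in zswap: bypass. */
	assert(mock_swapin_skip_readahead(&disk, in_zswap));
	/* Regular device, entry on disk: keep readahead. */
	assert(!mock_swapin_skip_readahead(&disk, on_disk));
}
```

In the real code the call site in do_swap_page() would then reduce to a
single condition, with the data_race() annotation and the zswap lookup
hidden inside whatever the helper ends up being called.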
Thanks,
Alex