[RFC PATCH] mm: bypass swap readahead for zswap

From: Alexandre Ghiti

Date: Wed Jun 24 2026 - 03:58:44 EST


Commit 0bcac06f27d7 ("mm, swap: skip swapcache for swapin of synchronous
device") made SWP_SYNCHRONOUS_IO devices (e.g. zram) skip swap readahead.

zswap is the same kind of in-memory, synchronous backend as zram, but it
is not a swap device flagged SWP_SYNCHRONOUS_IO, so swapin from zswap
still goes through swapin_readahead().
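
As a rough illustration of the path selection described above, the
following userspace C sketch models the choice between the synchronous
and the readahead swapin paths. All sketch_* names and the flag value are
hypothetical stand-ins, not kernel symbols; only the shape of the
condition mirrors the do_swap_page() change in this patch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for SWP_SYNCHRONOUS_IO; the value is arbitrary. */
#define SKETCH_SYNCHRONOUS_IO	(1u << 0)

enum sketch_swapin_path {
	SKETCH_SWAPIN_SYNC,		/* models swapin_sync(): no readahead */
	SKETCH_SWAPIN_READAHEAD,	/* models swapin_readahead() */
};

/* Without the bypass: only the device flag selects the synchronous path. */
static enum sketch_swapin_path
sketch_path_without_bypass(unsigned int dev_flags, bool in_zswap)
{
	(void)in_zswap;	/* zswap residency is ignored here */
	if (dev_flags & SKETCH_SYNCHRONOUS_IO)
		return SKETCH_SWAPIN_SYNC;
	return SKETCH_SWAPIN_READAHEAD;
}

/* With the bypass: an entry held in zswap also takes the synchronous path. */
static enum sketch_swapin_path
sketch_path_with_bypass(unsigned int dev_flags, bool in_zswap)
{
	if ((dev_flags & SKETCH_SYNCHRONOUS_IO) || in_zswap)
		return SKETCH_SWAPIN_SYNC;
	return SKETCH_SWAPIN_READAHEAD;
}
```

For a device without the flag (dev_flags == 0) and an entry held in
zswap, the first helper picks the readahead path and the second picks
the synchronous one; entries that are not in zswap are unaffected.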

The results below compare mm-new with and without a readahead bypass for
zswap. The workload is a kernel build (make -j16) in a memcg, with
zswap=zstd and the shrinker off, on Sapphire Rapids, over 3 iterations.

768M memcg (sustained swap thrash):
  metric                    mm-new    + bypass    delta
  build time (s)             405.0       341.7    -15.6%
  zswap-in (GB)               79.5        53.0    -33%
  zswap-out (GB)             144.8       115.6    -20%
  swap readahead (pages)     6.79M       0.45M    -93%
  swap_ra hit (%)             72.1        89.9    +18pp

1G memcg (light pressure, build not memory-bound):
  metric                    mm-new    + bypass    delta
  build time (s)             177.7       176.0    ~same (no regression)
  zswap-in (GB)               10.2         7.5    -26%
  zswap-out (GB)              27.7        25.1    -9%
  swap readahead (pages)     1.07M       0.08M    -93%
  swap_ra hit (%)             68.6        87.2    +19pp
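
For reference, the delta column can be reproduced from the two
measurement columns. The C sketch below assumes the delta is the relative
change against mm-new, and a plain difference in percentage points for
swap_ra hit; the function names are hypothetical and only illustrate the
arithmetic.

```c
/* Relative change of the patched value against the baseline, in percent. */
static double sketch_rel_change_pct(double baseline, double patched)
{
	return (patched - baseline) / baseline * 100.0;
}

/* Difference between two percentages, in percentage points (pp). */
static double sketch_pp_change(double baseline_pct, double patched_pct)
{
	return patched_pct - baseline_pct;
}
```

With the 768M build times, sketch_rel_change_pct(405.0, 341.7) gives
about -15.6, and sketch_pp_change(72.1, 89.9) gives about 17.8, which
the table rounds to +18pp.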

The gain comes from no longer prefetching neighbouring pages, which is
pointless for an in-memory backend: readahead inflates anon residency and
thrashes the page cache (file pages get evicted and re-read), lengthens
each fault by synchronously (de)compressing a cluster of neighbours, and
adds compression traffic when those extra pages are reclaimed.

Bypassing swap readahead for zswap therefore makes sense.

Signed-off-by: Alexandre Ghiti <alex@xxxxxxxx>
---

- This bypass originally comes from Usama's series that implements
  large folio zswapin: while working on improving that series, I noticed
  that the gains I got came only from the readahead bypass.

 include/linux/zswap.h |  6 ++++++
 mm/memory.c           |  5 +++--
 mm/zswap.c            | 11 +++++++++++
 3 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/include/linux/zswap.h b/include/linux/zswap.h
index 30c193a1207e..b6f0e6198b6f 100644
--- a/include/linux/zswap.h
+++ b/include/linux/zswap.h
@@ -35,6 +35,7 @@ void zswap_lruvec_state_init(struct lruvec *lruvec);
 void zswap_folio_swapin(struct folio *folio);
 bool zswap_is_enabled(void);
 bool zswap_never_enabled(void);
+bool zswap_present_test(swp_entry_t swp);
 #else

 struct zswap_lruvec_state {};
@@ -69,6 +70,11 @@ static inline bool zswap_never_enabled(void)
 	return true;
 }

+static inline bool zswap_present_test(swp_entry_t swp)
+{
+	return false;
+}
+
 #endif

 #endif /* _LINUX_ZSWAP_H */
diff --git a/mm/memory.c b/mm/memory.c
index ff338c2abe92..5aa1ea9eb48a 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4827,8 +4827,9 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	if (folio)
 		swap_update_readahead(folio, vma, vmf->address);
 	if (!folio) {
-		/* Swapin bypasses readahead for SWP_SYNCHRONOUS_IO devices */
-		if (data_race(si->flags & SWP_SYNCHRONOUS_IO))
+		/* Swapin bypasses readahead for SWP_SYNCHRONOUS_IO devices and zswap */
+		if (data_race(si->flags & SWP_SYNCHRONOUS_IO) ||
+		    zswap_present_test(entry))
 			folio = swapin_sync(entry, GFP_HIGHUSER_MOVABLE,
 				thp_swapin_suitable_orders(vmf) | BIT(0),
 				vmf, NULL, 0);
diff --git a/mm/zswap.c b/mm/zswap.c
index 761cd699e0a3..5b85b4d17647 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -234,6 +234,17 @@ static inline struct xarray *swap_zswap_tree(swp_entry_t swp)
 		>> ZSWAP_ADDRESS_SPACE_SHIFT];
 }

+/**
+ * zswap_present_test - check if a swap entry is currently backed by zswap
+ * @swp: the swap entry to test
+ *
+ * Return: true if @swp has a zswap entry, false otherwise.
+ */
+bool zswap_present_test(swp_entry_t swp)
+{
+	return xa_load(swap_zswap_tree(swp), swp_offset(swp));
+}
+
 #define zswap_pool_debug(msg, p) \
 	pr_debug("%s pool %s\n", msg, (p)->tfm_name)

--
2.54.0