[RFC PATCH 2/2] zswap: add sysfs knob for same node mode
From: Nhat Pham
Date: Sat Mar 29 2025 - 07:02:56 EST
Taking advantage of the new node-selection capability of zsmalloc, allow
zswap to keep the compressed copy in the same node as the original page.
The main use case is CXL systems, where pages in the CXL tier should
stay in CXL when they are zswapped, so as not to create memory pressure
in the higher tier.
This new behavior is opt-in only, and can be enabled as follows:
echo Y > /sys/module/zswap/parameters/same_node_mode
Suggested-by: Gregory Price <gourry@xxxxxxxxxx>
Signed-off-by: Nhat Pham <nphamcs@xxxxxxxxx>
---
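Note (illustration only, not part of the patch): the node choice made in
zswap_compress() can be modelled in userspace roughly as below. This is a
minimal sketch under stated assumptions: same_node_mode, MOCK_ANY_NODE,
mock_alloc_node() and compressed_copy_node() are hypothetical stand-ins
invented for this example, and the mock allocator does not reflect how
zpool_malloc() or zsmalloc actually pick a node. It only shows that the
if/else in the hunk amounts to passing either a pointer to the page's
node id or NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the zswap_same_node_mode parameter. */
static bool same_node_mode;

/* Value the mock allocator returns when it receives no node hint. */
#define MOCK_ANY_NODE (-1)

/*
 * Mock of an allocator that takes an optional node hint. It returns the
 * hinted node, or MOCK_ANY_NODE when the hint is NULL. This is an
 * assumption made for illustration, not the real zpool_malloc().
 */
static int mock_alloc_node(const int *nid_hint)
{
	return nid_hint ? *nid_hint : MOCK_ANY_NODE;
}

/*
 * Mirrors the if/else added by the patch, written as a single conditional
 * argument: pass the page's node only when same-node mode is enabled.
 */
static int compressed_copy_node(int page_nid)
{
	assert(page_nid >= 0);
	return mock_alloc_node(same_node_mode ? &page_nid : NULL);
}
```

With same_node_mode left false, the mock places the copy on any node;
setting it to true makes the copy follow the node of the original page.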
Documentation/admin-guide/mm/zswap.rst | 9 +++++++++
mm/zswap.c | 10 ++++++++--
2 files changed, 17 insertions(+), 2 deletions(-)
diff --git a/Documentation/admin-guide/mm/zswap.rst b/Documentation/admin-guide/mm/zswap.rst
index fd3370aa43fe..be8953acc15e 100644
--- a/Documentation/admin-guide/mm/zswap.rst
+++ b/Documentation/admin-guide/mm/zswap.rst
@@ -142,6 +142,15 @@ User can enable it as follows::
This can be enabled at the boot time if ``CONFIG_ZSWAP_SHRINKER_DEFAULT_ON`` is
selected.
+In a NUMA system, sometimes we want the compressed copy to reside in the same
+node as the original page. For instance, if we use the NUMA nodes to represent
+a CXL-based memory tiering system, we do not want the pages demoted to the
+lower tier to accidentally return to the higher tier via zswap, creating
+memory pressure in the higher tier. The same-node behavior can be enabled
+as follows::
+
+ echo Y > /sys/module/zswap/parameters/same_node_mode
+
A debugfs interface is provided for various statistic about pool size, number
of pages stored, same-value filled pages and various counters for the reasons
pages are rejected.
diff --git a/mm/zswap.c b/mm/zswap.c
index 89b6d4ade4cd..2eee57648750 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -129,6 +129,9 @@ static bool zswap_shrinker_enabled = IS_ENABLED(
CONFIG_ZSWAP_SHRINKER_DEFAULT_ON);
module_param_named(shrinker_enabled, zswap_shrinker_enabled, bool, 0644);
+static bool zswap_same_node_mode;
+module_param_named(same_node_mode, zswap_same_node_mode, bool, 0644);
+
bool zswap_is_enabled(void)
{
return zswap_enabled;
@@ -942,7 +945,7 @@ static bool zswap_compress(struct page *page, struct zswap_entry *entry,
{
struct crypto_acomp_ctx *acomp_ctx;
struct scatterlist input, output;
- int comp_ret = 0, alloc_ret = 0;
+ int comp_ret = 0, alloc_ret = 0, nid = page_to_nid(page);
unsigned int dlen = PAGE_SIZE;
unsigned long handle;
struct zpool *zpool;
@@ -981,7 +984,10 @@ static bool zswap_compress(struct page *page, struct zswap_entry *entry,
zpool = pool->zpool;
gfp = GFP_NOWAIT | __GFP_NORETRY | __GFP_HIGHMEM | __GFP_MOVABLE;
- alloc_ret = zpool_malloc(zpool, dlen, gfp, &handle, NULL);
+ if (zswap_same_node_mode)
+ alloc_ret = zpool_malloc(zpool, dlen, gfp, &handle, &nid);
+ else
+ alloc_ret = zpool_malloc(zpool, dlen, gfp, &handle, NULL);
if (alloc_ret)
goto unlock;
--
2.47.1