Re: [PATCH v2 9/9] mm: zswap: per-node kmem accounting for zswap/zsmalloc

From: Nhat Pham

Date: Mon Jun 29 2026 - 14:14:45 EST


On Mon, Jun 29, 2026 at 6:14 AM Alexandre Ghiti <alex@xxxxxxxx> wrote:
>
> Hi Usama,
>
> On 6/26/26 16:32, Usama Arif wrote:
> > On Fri, 26 Jun 2026 12:20:58 +0200 Alexandre Ghiti <alex@xxxxxxxx> wrote:
> >
> >> Update zswap and zsmalloc to use per-node obj_cgroup for kmem
> >> accounting, attributing compressed page charges to the correct
> >> NUMA node.
> >>
> >> But actually, this is incomplete because it does not correctly account
> >> for entries that straddle pages, those pages being possibly on 2 different
> >> nodes.
> >>
> >> This will be correctly handled by Joshua in a different series [1].
> >>
> >> Link: https://lore.kernel.org/linux-mm/20260311195153.4013476-1-joshua.hahnjy@xxxxxxxxx/ [1]
> >> Signed-off-by: Alexandre Ghiti <alex@xxxxxxxx>
> >> ---
> >> include/linux/zsmalloc.h | 2 ++
> >> mm/zsmalloc.c | 11 +++++++++++
> >> mm/zswap.c | 19 ++++++++++++++++++-
> >> 3 files changed, 31 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/include/linux/zsmalloc.h b/include/linux/zsmalloc.h
> >> index 478410c880b1..30427f3fe232 100644
> >> --- a/include/linux/zsmalloc.h
> >> +++ b/include/linux/zsmalloc.h
> >> @@ -50,6 +50,8 @@ void zs_obj_read_sg_end(struct zs_pool *pool, unsigned long handle);
> >> void zs_obj_write(struct zs_pool *pool, unsigned long handle,
> >> void *handle_mem, size_t mem_len);
> >>
> >> +int zs_handle_to_nid(struct zs_pool *pool, unsigned long handle);
> >> +
> >> extern const struct movable_operations zsmalloc_mops;
> >>
> >> #endif
> >> diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> >> index 83f5820c45f9..17f7403ebe77 100644
> >> --- a/mm/zsmalloc.c
> >> +++ b/mm/zsmalloc.c
> >> @@ -1380,6 +1380,17 @@ static void obj_free(int class_size, unsigned long obj)
> >> mod_zspage_inuse(zspage, -1);
> >> }
> >>
> >> +int zs_handle_to_nid(struct zs_pool *pool, unsigned long handle)
> >> +{
> >> +	unsigned long obj;
> >> +	struct zpdesc *zpdesc;
> >> +
> >> +	obj = handle_to_obj(handle);
> >> +	obj_to_zpdesc(obj, &zpdesc);
> >> +	return page_to_nid(zpdesc_page(zpdesc));
> >> +}
> >> +EXPORT_SYMBOL(zs_handle_to_nid);
> > Does this need the same locking as the other handle-to-zspage paths?
> > zs_free() takes pool->lock before handle_to_obj() because zspage migration can
> > update or move the object behind the handle. This helper does the same decode
> > without the lock, so zswap's uncharge path can race migration and charge or
> > uncharge the wrong node, or observe transient zspage state.
>
>
> You're totally right, I missed this, thanks!
>
> Thanks,
>
> Alex

If we are to do this, is there a way to extend zsmalloc's interface so
that it returns the initial node placement together with the handle in
zs_malloc()?

That way, we can avoid going through the zsmalloc locks again. It's
quite expensive, especially with compaction in the picture - the
pool->lock is a global rwlock :)
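
To make the idea concrete, here is a rough userspace model, not kernel
code. Every mock_* name is made up for illustration; zsmalloc has no
such interface today, and the lock counter only stands in for taking
pool->lock. The point it tries to show: if the allocation reports the
node once and the caller caches it, the uncharge side can use the cached
value and never has to decode the handle under the lock.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-ins only; none of these types exist in the kernel. */
struct mock_pool {
	int lock_acquisitions;	/* models how often pool->lock would be taken */
	int alloc_nid;		/* node the next backing page is allocated on */
};

struct mock_obj {
	int nid;		/* node of the page currently backing the object */
};

/* Hypothetical allocator variant: returns the handle and reports the
 * initial node placement through *nid. */
static unsigned long mock_zs_malloc_nid(struct mock_pool *pool, size_t size,
					int *nid)
{
	struct mock_obj *obj = malloc(sizeof(*obj));

	(void)size;		/* the size is irrelevant to this model */
	if (!obj)
		return 0;
	obj->nid = pool->alloc_nid;
	if (nid)
		*nid = obj->nid;
	return (unsigned long)obj;
}

/* Models a locked handle-to-node lookup: each call costs one lock trip. */
static int mock_zs_handle_to_nid(struct mock_pool *pool, unsigned long handle)
{
	pool->lock_acquisitions++;
	return ((struct mock_obj *)handle)->nid;
}

/* Caller-side entry that remembers the node it was charged to. */
struct mock_entry {
	unsigned long handle;
	int nid;
};

/* Store path: allocate and cache the node in one step. */
static int mock_store(struct mock_pool *pool, struct mock_entry *entry,
		      size_t len)
{
	entry->handle = mock_zs_malloc_nid(pool, len, &entry->nid);
	return entry->handle ? 0 : -1;
}

/* Uncharge path: uses the cached node, so no lock and no handle decode. */
static int mock_uncharge_nid(const struct mock_entry *entry)
{
	assert(entry->handle);
	return entry->nid;
}

static void mock_free(struct mock_entry *entry)
{
	free((void *)entry->handle);
	entry->handle = 0;
}
```

In this model the charge and the uncharge always hit the same node, even
if the object is later moved. The trade-off is that the cached value
only reflects the initial placement and does not follow migration.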