Re: [PATCH] mm/zsmalloc: fix NULL pointer dereference in get_next_zpdesc

From: Sergey Senozhatsky

Date: Wed Feb 18 2026 - 00:46:48 EST


On (26/02/09 19:37), Michael Fara wrote:
[..]
> The sequence is:
> 1. Compaction calls zs_page_isolate() on a zpdesc, then drops its
> page lock.
> 2. Concurrently, async_free_zspage() or free_zspage() destroys the
> zspage, calling reset_zpdesc() which sets zpdesc->zspage = NULL.
> 3. A subsequent zs_free() path calls trylock_zspage(), which iterates
> zpdescs via get_next_zpdesc(). get_zspage() dereferences the now-
> NULL backpointer, causing:
>
> BUG: kernel NULL pointer dereference, address: 0000000000000000
> RIP: 0010:free_zspage+0x26/0x100
> Call Trace:
> zs_free+0xf4/0x110
> zswap_entry_free+0x7e/0x160
>
> The migration side already has a NULL guard (zs_page_migrate line 1675:
> "if (!zpdesc->zspage) return 0;"), but get_next_zpdesc() lacks the same
> protection.
>
> Fix this by reading zpdesc->zspage directly in get_next_zpdesc()
> instead of going through get_zspage(), and returning NULL when the
> backpointer is NULL. This stops iteration safely — the caller treats
> it as the end of the page chain.
>
> Signed-off-by: Michael Fara <mjfara@xxxxxxxxx>
> ---
> mm/zsmalloc.c | 14 +++++++++++++-
> 1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> --- a/mm/zsmalloc.c
> +++ b/mm/zsmalloc.c
> @@ -735,7 +735,19 @@ static struct zspage *get_zspage(struct zpdesc *zpdesc)
>
> static struct zpdesc *get_next_zpdesc(struct zpdesc *zpdesc)
> {
> - struct zspage *zspage = get_zspage(zpdesc);
> + struct zspage *zspage = zpdesc->zspage;
> +
> + /*
> + * If the backpointer is NULL, this zpdesc was already freed via
> + * reset_zpdesc() by a racing async_free_zspage() while isolated
> + * for compaction. See the TODO comment in zs_page_migrate().
> + */
> + if (unlikely(!zspage)) {
> + WARN_ON_ONCE(1);
> + return NULL;
> + }

I need to look closer, but a quick glance suggests that this is a
problematic approach. We can't just return NULL from get_next_zpdesc()
because that can cause issues in the callers. E.g. trylock_zspage()
will treat NULL as the end of the page chain and return success
without having locked the remaining zpdescs, which is clearly wrong.
We also have a bunch of other callers that never expect NULL from
get_next_zpdesc().
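
To illustrate the trylock_zspage() concern, below is a rough userspace
model, not kernel code. The struct layouts, the "locked" flag and the
demo_locked_after_race() helper are simplified stand-ins that I'm
assuming purely for illustration. Only the loop shape is meant to
resemble trylock_zspage(), and get_next_zpdesc() carries the NULL guard
from the patch above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for illustration only; not the kernel definitions. */
struct zspage;

struct zpdesc {
	struct zspage *zspage;	/* backpointer; reset_zpdesc() sets it to NULL */
	struct zpdesc *next;	/* next zpdesc in the zspage chain */
	bool locked;		/* models the page lock */
};

struct zspage {
	struct zpdesc *first_zpdesc;
};

/* get_next_zpdesc() with the NULL guard proposed in the patch. */
static struct zpdesc *get_next_zpdesc(struct zpdesc *zpdesc)
{
	if (!zpdesc->zspage)
		return NULL;	/* caller cannot tell this from "end of chain" */
	return zpdesc->next;
}

static bool zpdesc_trylock(struct zpdesc *zpdesc)
{
	if (zpdesc->locked)
		return false;
	zpdesc->locked = true;
	return true;
}

static void zpdesc_unlock(struct zpdesc *zpdesc)
{
	zpdesc->locked = false;
}

/* Same loop shape as trylock_zspage(): lock every zpdesc, or none. */
static bool trylock_zspage(struct zspage *zspage)
{
	struct zpdesc *cursor, *fail;

	for (cursor = zspage->first_zpdesc; cursor != NULL;
	     cursor = get_next_zpdesc(cursor)) {
		if (!zpdesc_trylock(cursor)) {
			fail = cursor;
			goto unlock;
		}
	}
	return true;
unlock:
	for (cursor = zspage->first_zpdesc; cursor != fail;
	     cursor = get_next_zpdesc(cursor))
		zpdesc_unlock(cursor);
	return false;
}

/*
 * Hypothetical demo helper: build a three-zpdesc chain, clear the middle
 * backpointer as a racing reset_zpdesc() would, then report how many
 * zpdescs ended up locked and whether trylock_zspage() claimed success.
 */
static int demo_locked_after_race(bool *trylock_ok)
{
	struct zspage zspage;
	struct zpdesc z[3];
	int i, locked = 0;

	assert(trylock_ok != NULL);

	for (i = 0; i < 3; i++) {
		z[i].zspage = &zspage;
		z[i].next = (i < 2) ? &z[i + 1] : NULL;
		z[i].locked = false;
	}
	zspage.first_zpdesc = &z[0];

	z[1].zspage = NULL;	/* simulated race: backpointer cleared */

	*trylock_ok = trylock_zspage(&zspage);
	for (i = 0; i < 3; i++)
		locked += z[i].locked;
	return locked;
}
```

In this model the walk stops at the zpdesc whose backpointer was
cleared, so trylock_zspage() reports success while the last zpdesc in
the chain is still unlocked. The caller then proceeds as if it held
every page lock.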