[PATCH] mm: khugepaged: free xarray nodes left behind by failed collapse_file()

From: Rik van Riel

Date: Tue Jun 16 2026 - 10:55:35 EST


collapse_file() calls xas_create_range() up front to populate the
xarray with nodes spanning the entire collapse range, including empty
slots for the holes (the nr_none entries).

On the success path those nodes are either consumed by the multi-index
store of new_folio or freed by the nr_none retry-entry dance. But every
error path that branches straight to the rollback label -- e.g.
SCAN_TRUNCATED / SCAN_PAGE_LOCK / SCAN_FAIL detected in the main scan
loop (which then sets nr_none = 0 and jumps to rollback), or SCAN_COPY_MC
during the copy -- does neither. The empty nodes are left dangling in
mapping->i_pages and are leaked once the inode is finally evicted. This
is exactly the "THP collapse_file() failed" case that clear_inode()
documents and deliberately tolerates without warning.

syzkaller reproduces it trivially with MADV_COLLAPSE on a sparse shmem
mapping (collapse aborts with SCAN_TRUNCATED because the range is empty),
and also via slab fault injection, which forces xas_create_range() down
the xas_nomem() path before the same abort. kmemleak then reports the
576-byte struct xa_node objects allocated in xas_alloc()/xas_nomem().

The leaked objects are struct xa_node (576 bytes each), left dangling in
mapping->i_pages and reclaimed only when the inode is finally evicted.
Nodes leak when a collapse takes one of the rollback paths (SCAN_TRUNCATED
/ SCAN_PAGE_LOCK / SCAN_FAIL / SCAN_COPY_MC), and only for the empty hole
slots (the nr_none entries); they are not leaked on the success path (the
nodes are consumed by the multi-index store and the nr_none dance), and
slots still holding folios are never touched.

Prune the now-empty nodes on the rollback path. A node is only freed
once its slot count reaches zero, and storing NULL into an
already-empty slot is a no-op, so briefly store an XA_RETRY_ENTRY into
each empty slot and immediately clear it: the clear drops the count
back to zero and lets xas_store() -> xas_delete_node() free the node
and its now-empty ancestors. Slots still holding the original folios
are left untouched.

Fixes: 77da9389b9d5 ("mm: Convert collapse_shmem to XArray")
Assisted-by: Claude:claude-opus-4-8 syzkaller
Signed-off-by: Rik van Riel <riel@xxxxxxxxxxx>
Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Cc: David Hildenbrand <david@xxxxxxxxxx>
Cc: Lorenzo Stoakes <ljs@xxxxxxxxxx>
Cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>
Cc: Hugh Dickins <hughd@xxxxxxxxxx>
Cc: Zi Yan <ziy@xxxxxxxxxx>
Cc: Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx>
Cc: Liam R. Howlett <liam@xxxxxxxxxxxxx>
Cc: Nico Pache <npache@xxxxxxxxxx>
Cc: Ryan Roberts <ryan.roberts@xxxxxxx>
Cc: Dev Jain <dev.jain@xxxxxxx>
Cc: Barry Song <baohua@xxxxxxxxxx>
Cc: Lance Yang <lance.yang@xxxxxxxxx>
Cc: linux-mm@xxxxxxxxx
Cc: linux-kernel@xxxxxxxxxxxxxxx
---
mm/khugepaged.c | 33 +++++++++++++++++++++++++++++++++
1 file changed, 33 insertions(+)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index b8452dbdb043..d11a4c9610e1 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -2273,6 +2273,39 @@ static enum scan_result collapse_file(struct mm_struct *mm, unsigned long addr,

rollback:
/* Something went wrong: roll back page cache changes */
+
+ /*
+ * xas_create_range() above populated the xarray with nodes spanning
+ * the whole collapse range, including empty slots for the holes
+ * (nr_none entries). On the success path these nodes are consumed by
+ * the multi-index store of new_folio, and the nr_none handling further
+ * up frees the ones covering the holes; but the error paths that branch
+ * straight here do neither. Prune the now-empty nodes explicitly,
+ * otherwise they are leaked until the mapping is torn down -- one of
+ * the two cases called out in the comment in clear_inode().
+ *
+ * A node can only be deleted once its slot count drops to zero, so
+ * briefly store an XA_RETRY_ENTRY into each empty slot and then clear
+ * it again: clearing the retry entry drops the count back to zero and
+ * lets xas_store() -> xas_delete_node() free the node. Slots that
+ * still hold the original folios are left untouched.
+ */
+ xas_lock_irq(&xas);
+ xas_set_order(&xas, start, 0);
+ for (index = start; index < end; index++) {
+ if (!xas_next(&xas)) {
+ xas_store(&xas, XA_RETRY_ENTRY);
+ if (xas_error(&xas))
+ break;
+ }
+ }
+ xas_set_order(&xas, start, 0);
+ for (index = start; index < end; index++) {
+ if (xas_next(&xas) == XA_RETRY_ENTRY)
+ xas_store(&xas, NULL);
+ }
+ xas_unlock_irq(&xas);
+
if (nr_none) {
xas_lock_irq(&xas);
mapping->nrpages -= nr_none;
--
2.53.0-Meta