Re: [PATCH mm-unstable 1/1] mm: fix deferred split queue races during migration
From: Andrew Morton
Date: Wed Apr 01 2026 - 19:20:06 EST
On Wed, 01 Apr 2026 18:55:48 -0400 Zi Yan <ziy@xxxxxxxxxx> wrote:
> Can you apply the fixup below to move the comment? Lance told me he
> would be away for a while, so he could not send a fixup to move
> the comment.
Thanks. I folded that into Lance's base patch so here's the whole
thing:
From: Lance Yang <lance.yang@xxxxxxxxx>
Subject: mm: fix deferred split queue races during migration
Date: Wed, 1 Apr 2026 21:10:32 +0800
migrate_folio_move() records the deferred split queue state from src and
replays it on dst. Replaying it after remove_migration_ptes(src, dst, 0)
makes dst visible before it is requeued, so a concurrent rmap-removal path
can mark dst partially mapped and trip the WARN in deferred_split_folio().

Move the requeue before remove_migration_ptes() so dst is back on the
deferred split queue before it becomes visible again.

Because migration still holds dst locked at that point, teach
deferred_split_scan() to requeue a folio when folio_trylock() fails.
Otherwise a fully mapped underused folio can be dequeued by the shrinker
and silently lost from split_queue.
[ziy@xxxxxxxxxx: move the comment]
Link: https://lkml.kernel.org/r/FB71A764-0F10-4E5A-B4A0-BA4C7F138408@xxxxxxxxxx
Link: https://syzkaller.appspot.com/bug?extid=a7067a757858ac8eb085
Link: https://lkml.kernel.org/r/20260401131032.13011-1-lance.yang@xxxxxxxxx
Fixes: 8a8ca142a488 ("mm: migrate: requeue destination folio on deferred split queue")
Signed-off-by: Lance Yang <lance.yang@xxxxxxxxx>
Signed-off-by: Zi Yan <ziy@xxxxxxxxxx>
Reported-by: syzbot+a7067a757858ac8eb085@xxxxxxxxxxxxxxxxxxxxxxxxx
Closes: https://lore.kernel.org/linux-mm/69ccb65b.050a0220.183828.003a.GAE@xxxxxxxxxx/
Suggested-by: David Hildenbrand (Arm) <david@xxxxxxxxxx>
Acked-by: David Hildenbrand (Arm) <david@xxxxxxxxxx>
Acked-by: Zi Yan <ziy@xxxxxxxxxx>
Cc: Alistair Popple <apopple@xxxxxxxxxx>
Cc: Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx>
Cc: Barry Song <baohua@xxxxxxxxxx>
Cc: Byungchul Park <byungchul@xxxxxx>
Cc: David Hildenbrand <david@xxxxxxxxxx>
Cc: Deepanshu Kartikey <kartikey406@xxxxxxxxx>
Cc: Dev Jain <dev.jain@xxxxxxx>
Cc: Gregory Price <gourry@xxxxxxxxxx>
Cc: "Huang, Ying" <ying.huang@xxxxxxxxxxxxxxxxx>
Cc: Joshua Hahn <joshua.hahnjy@xxxxxxxxx>
Cc: Lance Yang <lance.yang@xxxxxxxxx>
Cc: Liam Howlett <liam.howlett@xxxxxxxxxx>
Cc: Lorenzo Stoakes (Oracle) <ljs@xxxxxxxxxx>
Cc: Matthew Brost <matthew.brost@xxxxxxxxx>
Cc: Nico Pache <npache@xxxxxxxxxx>
Cc: Rakie Kim <rakie.kim@xxxxxx>
Cc: Ryan Roberts <ryan.roberts@xxxxxxx>
Cc: Wei Yang <richard.weiyang@xxxxxxxxx>
Cc: Ying Huang <ying.huang@xxxxxxxxxxxxxxxxx>
Cc: Usama Arif <usama.arif@xxxxxxxxx>
Cc: <stable@xxxxxxxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---
mm/huge_memory.c | 15 ++++++++++-----
mm/migrate.c | 18 +++++++++---------
2 files changed, 19 insertions(+), 14 deletions(-)
--- a/mm/huge_memory.c~mm-fix-deferred-split-queue-races-during-migration
+++ a/mm/huge_memory.c
@@ -4542,7 +4542,7 @@ retry:
 				goto next;
 		}
 		if (!folio_trylock(folio))
-			goto next;
+			goto requeue;
 		if (!split_folio(folio)) {
 			did_split = true;
 			if (underused)
@@ -4551,13 +4551,18 @@ retry:
 		}
 		folio_unlock(folio);
 next:
+		/*
+		 * If thp_underused() returns false, or if split_folio()
+		 * succeeds, or if split_folio() fails in the case it was
+		 * underused, then consider it used and don't add it back to
+		 * split_queue.
+		 */
 		if (did_split || !folio_test_partially_mapped(folio))
 			continue;
+requeue:
 		/*
-		 * Only add back to the queue if folio is partially mapped.
-		 * If thp_underused returns false, or if split_folio fails
-		 * in the case it was underused, then consider it used and
-		 * don't add it back to split_queue.
+		 * Add back partially mapped folios, or underused folios that
+		 * we could not lock this round.
 		 */
 		fqueue = folio_split_queue_lock_irqsave(folio, &flags);
 		if (list_empty(&folio->_deferred_list)) {
--- a/mm/migrate.c~mm-fix-deferred-split-queue-races-during-migration
+++ a/mm/migrate.c
@@ -1384,6 +1384,15 @@ static int migrate_folio_move(free_folio
 		goto out;

 	/*
+	 * Requeue the destination folio on the deferred split queue if
+	 * the source was on the queue.  The source is unqueued in
+	 * __folio_migrate_mapping(), so we recorded the state from
+	 * before move_to_new_folio().
+	 */
+	if (src_deferred_split)
+		deferred_split_folio(dst, src_partially_mapped);
+
+	/*
 	 * When successful, push dst to LRU immediately: so that if it
 	 * turns out to be an mlocked page, remove_migration_ptes() will
 	 * automatically build up the correct dst->mlock_count for it.
@@ -1399,15 +1408,6 @@ static int migrate_folio_move(free_folio
 	if (old_page_state & PAGE_WAS_MAPPED)
 		remove_migration_ptes(src, dst, 0);

-	/*
-	 * Requeue the destination folio on the deferred split queue if
-	 * the source was on the queue.  The source is unqueued in
-	 * __folio_migrate_mapping(), so we recorded the state from
-	 * before move_to_new_folio().
-	 */
-	if (src_deferred_split)
-		deferred_split_folio(dst, src_partially_mapped);
-
 out_unlock_both:
 	folio_unlock(dst);
 	folio_set_owner_migrate_reason(dst, reason);
_
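
As a rough illustration of the deferred_split_scan() change described in
the changelog, here is a minimal userspace sketch in C. It is not kernel
code: struct model_folio, model_scan_one() and the requeue_on_lock_failure
switch are hypothetical names invented for this example, and the model
assumes that a split always succeeds once the lock is taken. It only shows
why a locked, fully mapped, underused folio falls off the queue when a
failed trylock jumps to "next", and stays queued when it jumps to "requeue".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_folio {
	bool locked;           /* lock held elsewhere, e.g. by migration */
	bool partially_mapped; /* models folio_test_partially_mapped() */
	bool underused;        /* models thp_underused() returning true */
	bool queued;           /* on the modelled deferred split queue */
};

/*
 * Model one shrinker pass over a single queued folio.
 * Returns true if the folio was (notionally) split.
 * Assumption: a split always succeeds once the lock is taken.
 */
static bool model_scan_one(struct model_folio *f, bool requeue_on_lock_failure)
{
	assert(f->queued);
	f->queued = false; /* the scan takes the folio off the queue first */

	/* Fully mapped and not underused: treated as used, not requeued. */
	if (!f->partially_mapped && !f->underused)
		return false;

	if (f->locked) {
		/*
		 * Trylock failed. Old behaviour ("goto next") requeues only
		 * partially mapped folios; patched behaviour ("goto requeue")
		 * always requeues.
		 */
		f->queued = requeue_on_lock_failure || f->partially_mapped;
		return false;
	}

	return true; /* split succeeded, folio leaves the queue */
}
```

In this model, a folio with locked and underused set ends up with queued
false when requeue_on_lock_failure is false, and with queued true when it
is set; a locked, partially mapped folio is requeued either way.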