Re: [PATCH mm-unstable 1/1] mm: fix deferred split queue races during migration
From: Zi Yan
Date: Wed Apr 01 2026 - 15:21:38 EST
On 1 Apr 2026, at 9:10, Lance Yang wrote:
> From: Lance Yang <lance.yang@xxxxxxxxx>
>
> migrate_folio_move() records the deferred split queue state from src and
> replays it on dst. Replaying it after remove_migration_ptes(src, dst, 0)
> makes dst visible before it is requeued, so a concurrent rmap-removal path
> can mark dst partially mapped and trip the WARN in deferred_split_folio().
>
> Move the requeue before remove_migration_ptes() so dst is back on the
> deferred split queue before it becomes visible again.
>
> Because migration still holds dst locked at that point, teach
> deferred_split_scan() to requeue a folio when folio_trylock() fails.
> Otherwise a fully mapped underused folio can be dequeued by the shrinker
> and silently lost from split_queue.
>
> Link: https://syzkaller.appspot.com/bug?extid=a7067a757858ac8eb085
> Fixes: 8a8ca142a488 ("mm: migrate: requeue destination folio on deferred split queue")
> Reported-by: syzbot+a7067a757858ac8eb085@xxxxxxxxxxxxxxxxxxxxxxxxx
> Closes: https://lore.kernel.org/linux-mm/69ccb65b.050a0220.183828.003a.GAE@xxxxxxxxxx/
> Cc: <stable@xxxxxxxxxxxxxxx>
> Suggested-by: David Hildenbrand (Arm) <david@xxxxxxxxxx>
> Signed-off-by: Lance Yang <lance.yang@xxxxxxxxx>
> ---
>
> [ Backport note ]
> This patch is a follow-up fix for 8a8ca142a488 ("mm: migrate: requeue
> destination folio on deferred split queue"), which is currently only in
> mm-stable, and should be backported together with it.
>
> Credit for this fix goes to David, thanks!
>
>  mm/huge_memory.c | 12 +++++++-----
>  mm/migrate.c     | 18 +++++++++---------
>  2 files changed, 16 insertions(+), 14 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index ff9a42abd1b6..ac6d823e351f 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -4558,7 +4558,7 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
> goto next;
> }
> if (!folio_trylock(folio))
> - goto next;
> + goto requeue;
> if (!split_folio(folio)) {
> did_split = true;
> if (underused)
> @@ -4569,11 +4569,13 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
> next:
> if (did_split || !folio_test_partially_mapped(folio))
> continue;
> +requeue:
> /*
> - * Only add back to the queue if folio is partially mapped.
> - * If thp_underused returns false, or if split_folio fails
> - * in the case it was underused, then consider it used and
> - * don't add it back to split_queue.
> + * Add back partially mapped folios, or underused folios
> + * that we could not lock this round. If thp_underused()
> + * returns false, or if split_folio() succeeds, or if
> + * split_folio() fails in the case it was underused, then
> + * consider it used and don't add it back to split_queue.
> */
Should the sentence
"If thp_underused() returns false, or if split_folio() succeeds, or if
split_folio() fails in the case it was underused, then
consider it used and don't add it back to split_queue."
be moved below the "next:" label?
It describes the check under "next:", not the requeue path:
"thp_underused() returns false" corresponds to "if (!underused) goto next",
"split_folio() succeeds" corresponds to "did_split == true" in that if, and
"split_folio() fails in the case it was underused" corresponds to
"did_split == false and !folio_test_partially_mapped(folio)" in that if.
Only the first sentence of the comment matches the goto requeue taken when
folio_trylock() fails.
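To make the mapping between the comment and the branches concrete, here is
a small userspace model of the decision at the end of the loop body. It is
only a sketch, not kernel code: struct toy_folio, toy_scan_requeues() and
toy_self_check() are hypothetical names, the booleans stand in for the
results of the real helpers, and it assumes that thp_underused() is only
consulted for folios that are not partially mapped.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the state the scan loop looks at. */
struct toy_folio {
	bool partially_mapped;	/* folio_test_partially_mapped() */
	bool underused;		/* what thp_underused() would return */
	bool trylock_ok;	/* whether folio_trylock() succeeds */
	bool split_ok;		/* whether split_folio() returns 0 */
};

/* Returns true if the folio would be added back to split_queue. */
static bool toy_scan_requeues(const struct toy_folio *f)
{
	bool did_split = false;

	/* Assumption: only non-partially-mapped folios get the underused check. */
	if (!f->partially_mapped && !f->underused)
		goto next;
	if (!f->trylock_ok)
		goto requeue;
	if (f->split_ok)
		did_split = true;
next:
	/*
	 * thp_underused() returned false, or the split succeeded, or the
	 * split failed for an underused folio: consider it used.
	 */
	if (did_split || !f->partially_mapped)
		return false;
requeue:
	/* Partially mapped, or a folio we could not lock this round. */
	return true;
}

/* Walks the cases named in the comment; aborts if the model disagrees. */
void toy_self_check(void)
{
	struct toy_folio used = { false, false, true, true };
	struct toy_folio locked_underused = { false, true, false, false };
	struct toy_folio split_failed_underused = { false, true, true, false };
	struct toy_folio split_failed_partial = { true, false, true, false };

	assert(!toy_scan_requeues(&used));
	assert(toy_scan_requeues(&locked_underused));
	assert(!toy_scan_requeues(&split_failed_underused));
	assert(toy_scan_requeues(&split_failed_partial));
}
```

In this model the three "consider it used" cases all return at the check
under "next:", while the trylock failure is the only path that jumps
straight to "requeue:".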
Otherwise, LGTM.
Acked-by: Zi Yan <ziy@xxxxxxxxxx>
> fqueue = folio_split_queue_lock_irqsave(folio, &flags);
> if (list_empty(&folio->_deferred_list)) {
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 05cb408846f2..8a64291ab5b4 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1385,6 +1385,15 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private,
> if (rc)
> goto out;
>
> + /*
> + * Requeue the destination folio on the deferred split queue if
> + * the source was on the queue. The source is unqueued in
> + * __folio_migrate_mapping(), so we recorded the state from
> + * before move_to_new_folio().
> + */
> + if (src_deferred_split)
> + deferred_split_folio(dst, src_partially_mapped);
> +
> /*
> * When successful, push dst to LRU immediately: so that if it
> * turns out to be an mlocked page, remove_migration_ptes() will
> @@ -1401,15 +1410,6 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private,
> if (old_page_state & PAGE_WAS_MAPPED)
> remove_migration_ptes(src, dst, 0);
>
> - /*
> - * Requeue the destination folio on the deferred split queue if
> - * the source was on the queue. The source is unqueued in
> - * __folio_migrate_mapping(), so we recorded the state from
> - * before move_to_new_folio().
> - */
> - if (src_deferred_split)
> - deferred_split_folio(dst, src_partially_mapped);
> -
> out_unlock_both:
> folio_unlock(dst);
> folio_set_owner_migrate_reason(dst, reason);
> --
> 2.49.0
Best Regards,
Yan, Zi