Re: [RFC v2 12/21] mm: thp: handle split failure in device migration
From: Usama Arif
Date: Thu Mar 05 2026 - 12:01:54 EST
On 02/03/2026 21:20, Nico Pache wrote:
> On Thu, Feb 26, 2026 at 4:34 AM Usama Arif <usama.arif@xxxxxxxxx> wrote:
>>
>> Device memory migration has two call sites that split huge PMDs:
>>
>> migrate_vma_split_unmapped_folio():
>> Called from migrate_vma_pages() when migrating a PMD-mapped THP to a
>> destination that doesn't support compound pages. It splits the PMD
>> then splits the folio via folio_split_unmapped().
>>
>> If the PMD split fails, folio_split_unmapped() would operate on an
>> unsplit folio with inconsistent page table state. Propagate -ENOMEM
>> to skip this page's migration. This is safe as folio_split_unmapped
>> failure would be propagated in a similar way.
>>
>> migrate_vma_insert_page():
>> Called from migrate_vma_pages() when inserting a page into a VMA
>> during migration back from device memory. If a huge zero PMD exists
>> at the target address, it must be split before PTE insertion.
>>
>> If the split fails, the subsequent pte_alloc() and set_pte_at() would
>> operate on a PMD slot still occupied by the huge zero entry. Use
>> goto abort, consistent with other allocation failures in this function.
>>
>> Signed-off-by: Usama Arif <usama.arif@xxxxxxxxx>
>> ---
>> mm/migrate_device.c | 16 ++++++++++++++--
>> 1 file changed, 14 insertions(+), 2 deletions(-)
>>
>> diff --git a/mm/migrate_device.c b/mm/migrate_device.c
>> index 78c7acf024615..bc53e06fd9735 100644
>> --- a/mm/migrate_device.c
>> +++ b/mm/migrate_device.c
>> @@ -909,7 +909,13 @@ static int migrate_vma_split_unmapped_folio(struct migrate_vma *migrate,
>> int ret = 0;
>>
>> folio_get(folio);
>
> Should we be concerned about this folio_get? Are we incrementing a
> reference that was already held if we back out of the split?
>
> -- Nico
Hi Nico,
Thanks for pointing this out. It spun out to an entire investigation for me [1].
Similar to [1], I inserted trace prints [2] and created a new __split_huge_pmd2
that always returns -ENOMEM. Without folio_put on error [3], we get a refcount of 2.
hmm-tests-129 [000] .l... 1.485514: __migrate_device_finalize: FINALIZE[0]: src=ffb48827440e8000 dst=ffb48827440e8000 src==dst=1 refcount_src=2 mapcount_src=0 order_src=9 migrate=0 BEFORE remove_migration_ptes
hmm-tests-129 [000] .l... 1.485517: __migrate_device_finalize: FINALIZE[0]: src=ffb48827440e8000 refcount=3 mapcount=1 AFTER remove_migration_ptes
hmm-tests-129 [000] .l... 1.485518: __migrate_device_finalize: FINALIZE[0]: src=ffb48827440e8000 refcount=2 AFTER folio_put(src)
With folio_put on error [4], we get a refcount of 1.
hmm-tests-129 [001] ..... 1.492216: __migrate_device_finalize: FINALIZE[0]: src=fff7b8be840f0000 dst=fff7b8be840f0000 src==dst=1 refcount_src=1 mapcount_src=0 order_src=9 migrate=0 BEFORE remove_migration_ptes
hmm-tests-129 [001] ..... 1.492219: __migrate_device_finalize: FINALIZE[0]: src=fff7b8be840f0000 refcount=2 mapcount=1 AFTER remove_migration_ptes
hmm-tests-129 [001] ..... 1.492220: __migrate_device_finalize: FINALIZE[0]: src=fff7b8be840f0000 refcount=1 AFTER folio_put(src)
So we need folio_put for split_huge_pmd_address failure, but NOT for
folio_split_unmapped.
[1] https://lore.kernel.org/all/332c9e16-46c3-4e1c-898e-2cb0a87ba1fc@xxxxxxxxx/
[2] https://gist.github.com/uarif1/6abe4bedb85814e9be8d48a4fe742b41
[3] https://gist.github.com/uarif1/f718af2113bc1a33484674b61b9dafcc
[4] https://gist.github.com/uarif1/03c42f2549eaf2bc555e8b03e07a63c8