Re: [PATCH] mm/migrate_device: fix folio refcount leak on folio_split_unmapped failure
From: Usama Arif
Date: Thu Mar 05 2026 - 06:46:07 EST
On 05/03/2026 06:09, Mika Penttilä wrote:
> Hi!
>
> On 3/5/26 01:28, Usama Arif wrote:
>
>>
>> On 04/03/2026 22:09, Balbir Singh wrote:
>>> On 3/5/26 08:54, Zi Yan wrote:
>>>> On 4 Mar 2026, at 16:48, Balbir Singh wrote:
>>>>
>>>>> On 3/5/26 02:17, Zi Yan wrote:
>>>>>> On 4 Mar 2026, at 7:01, Usama Arif wrote:
>>>>>>
>>>>>>> From: Usama Arif <usama.arif@xxxxxxxxx>
>>>>>>>
>>>>>>> migrate_vma_split_unmapped_folio() takes an extra reference via
>>>>>>> folio_get() before calling folio_split_unmapped(). On success, the
>>>>>>> split consumes this reference: __folio_freeze_and_split_unmapped()
>>>>>>> expects the +1 in its folio_ref_freeze() check, and distributes it
>>>>>>> across the resulting sub-folios via folio_ref_unfreeze(...+1), which
>>>>>>> are later balanced by folio_put() calls in __migrate_device_finalize().
>>>>>>>
>>>>>>> If folio_split_unmapped() fails (e.g., unexpected pinning returns
>>>>>>> -EAGAIN), the function returns without calling folio_put(). The extra
>>>>>>> reference is never released.
>>>>>>>
>>>>>>> Add the missing folio_put() on the error path.
>>>>>>>
>>>>>>> Fixes: 4265d67e405a4 ("mm/migrate_device: add THP splitting during migration")
>>>>>>> Closes: https://lore.kernel.org/all/CAA1CXcDyqPPwf_-W7B+PFQtL8HdoJGCEqVsVxq7DhOUB=L4PQA@xxxxxxxxxxxxxx/
>>>>>>> Reported-by: Nico Pache <npache@xxxxxxxxxx>
>>>>>>> Signed-off-by: Usama Arif <usama.arif@xxxxxxxxx>
>>>>>>> ---
>>>>>>> mm/migrate_device.c | 4 +++-
>>>>>>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/mm/migrate_device.c b/mm/migrate_device.c
>>>>>>> index 0a8b31939640f..351ecd9065d13 100644
>>>>>>> --- a/mm/migrate_device.c
>>>>>>> +++ b/mm/migrate_device.c
>>>>>>> @@ -917,8 +917,10 @@ static int migrate_vma_split_unmapped_folio(struct migrate_vma *migrate,
>>>>>>> folio_get(folio);
>>>>>>> split_huge_pmd_address(migrate->vma, addr, true);
>>>>>>> ret = folio_split_unmapped(folio, 0);
>>>>>>> - if (ret)
>>>>>>> + if (ret) {
>>>>>>> + folio_put(folio);
>>>>>>> return ret;
>>>>>>> + }
>>>>>>> migrate->src[idx] &= ~MIGRATE_PFN_COMPOUND;
>>>>>>> flags = migrate->src[idx] & ((1UL << MIGRATE_PFN_SHIFT) - 1);
>>>>>>> pfn = migrate->src[idx] >> MIGRATE_PFN_SHIFT;
>>>>>>> --
>>>>>>> 2.47.3
>>>>>> Add Balbir, who wrote the code, to comment on this.
>>>>>>
>>>>> Thanks Zi!
>>>>>
>>>>> Just wondering if there is a reproducer for the issue and how the fix was tested?
>>>>> I expect migrate_vma_finalize() to be called for folios even when the split failed,
>>>>> and to drop the lock.
>>>> Does migrate_vma_finalize() do folio_put() for failed-to-split folios?
>>>> If so, how does it distinguish between split folios and failed-to-split folios?
>>>> By comparing source and destination folio orders?
>>>>
>>> We reset the MIGRATE_PFN_MIGRATE flag for failing to migrate pfns. We do a folio_put
>>> on the src in finalize, if it is split then on all the split folios as well.
>>>
>>>> What we see from migrate_vma_split_unmapped_folio() is that
>>>> it adds a refcount for all input folios, but only drops a refcount
>>>> for the split folio. Doesn’t that cause failed-to-split folios to have
>>>> an additional refcount?
>>>>
>> Hello!
>>
>> Thanks for reviewing, everyone. It's very difficult to create a reproducer: I think
>> the extra reference would need to appear after migrate_device_unmap() but before
>> folio_split_unmapped() in migrate_vma_pages(). That's hard to trigger reliably from
>> userspace.
>>
>> The fix came about when Nico indicated there might be an issue if split_huge_pmd_address
>> fails in my patch [1].
>>
>> Below is my step-by-step understanding of how the refcounting works here. I
>> might very well be wrong on this: the refcounting is a bit all over the place
>> and I might have missed a reference change somewhere, so I would really appreciate
>> it if someone could confirm this!
>>
>>
>> 1. migrate_vma_collect_huge_pmd():
>> a) folio_get(folio) -> +1 (collect reference)
>> 2. migrate_device_unmap():
>> a) folio_isolate_lru() -> +1 (isolation reference)
>> b) folio_put() -> -1 (drops the collect reference)
>>
>>
>> Without this fix:
>>
>> 3. migrate_vma_split_unmapped_folio():
>> a) folio_get(folio) -> +1 (split reference)
>> b) folio_split_unmapped() -> fails
>> c) Returns the error without calling folio_put() (the call this patch adds)
>> 4. Caller in migrate_vma_pages(): clears MIGRATE_PFN_MIGRATE | MIGRATE_PFN_COMPOUND
>> 5. __migrate_device_finalize(): sees !(src_pfns[i] & MIGRATE_PFN_MIGRATE), restores the folio:
>> a) remove_migration_ptes(src, src) — re-establishes user PTEs
>> b) folio_unlock(src)
>> c) folio_put(src) -> -1 (drops the isolation reference)
>>
>> The split reference in 3.a is never released and the folio has a permanently elevated refcount.
>> Unless I missed a folio_put somewhere for the refcount increase in folio_isolate_lru() (2.a)?
>>
>> Please let me know if this makes sense!
>>
>> [1] https://lore.kernel.org/all/CAA1CXcDyqPPwf_-W7B+PFQtL8HdoJGCEqVsVxq7DhOUB=L4PQA@xxxxxxxxxxxxxx/
>>
>>> Thanks! Yes, the patch makes sense
>>>
>>> Acked-by: Balbir Singh <balbirs@xxxxxxxxxx>
>>>
>>> Balbir
>
> I remember stumbling on this a while ago also. The folio_get() in migrate_vma_split_unmapped_folio()
> is balanced with put_page() in __split_huge_pmd_locked() (freeze = true), which can't fail for device pages.
> Folios at this point are unmapped but have 1 refcount from "collecting".
> After folio_split_unmapped() the refcount(s) is still 1.
>
> So it seems the code is good as is? A comment though would be good for the extra folio_get..
>
Hmm, I don't think the put_page() in __split_huge_pmd_locked() is there to balance the folio_get() in
migrate_vma_split_unmapped_folio(). There are other places where split_huge_pmd_locked() is called
with freeze = true [1], and they don't take a reference before calling split_huge_pmd.
I think the folio_put() in the freeze = true case of __split_huge_pmd_locked() is there because
migration entries are being installed?
[1] https://elixir.bootlin.com/linux/v6.19.3/source/mm/rmap.c#L2334
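
To make the accounting in the quoted walk-through easier to check, here is a
toy userspace model of steps 1 to 5 for the split-failure case. It is only a
sketch: struct toy_folio, toy_get(), toy_put() and toy_failed_split_leak() are
made-up names, not kernel APIs, and the model counts only the reference changes
listed in the walk-through. In particular, it does not model the put_page() in
__split_huge_pmd_locked() discussed above, so it cannot settle that question.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a folio: only the reference count is modelled. */
struct toy_folio {
	int refcount;
};

static void toy_get(struct toy_folio *f)
{
	f->refcount++;
}

static void toy_put(struct toy_folio *f)
{
	assert(f->refcount > 0);	/* catch an underflow in the model */
	f->refcount--;
}

/*
 * Replay the reference changes from the walk-through when the split fails.
 * The count starts at 0, so the return value is the net number of
 * references left over (0 means balanced, 1 means one leaked).
 */
static int toy_failed_split_leak(bool put_on_error)
{
	struct toy_folio f = { .refcount = 0 };

	toy_get(&f);		/* 1.a collect reference */
	toy_get(&f);		/* 2.a isolation reference */
	toy_put(&f);		/* 2.b drop the collect reference */

	toy_get(&f);		/* 3.a split reference */
	/* 3.b folio_split_unmapped() fails here */
	if (put_on_error)
		toy_put(&f);	/* the folio_put() added by the patch */

	toy_put(&f);		/* 5.c finalize drops the isolation reference */

	return f.refcount;
}
```

Under these assumptions, toy_failed_split_leak(false) leaves one reference
behind and toy_failed_split_leak(true) ends balanced, which is the imbalance
the walk-through describes.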