Re: [PATCH v2 1/3] mm: memory-failure: update ttu flag inside unmap_poisoned_folio

From: Miaohe Lin
Date: Mon Jan 20 2025 - 22:20:25 EST


On 2025/1/20 16:46, David Hildenbrand wrote:
> On 20.01.25 08:49, David Hildenbrand wrote:
>>
>>>>        if (folio_test_hugetlb(folio) && !folio_test_anon(folio)) {
>>>>            struct address_space *mapping;
>>>>    @@ -1572,7 +1598,7 @@ void unmap_poisoned_folio(struct folio *folio, enum ttu_flags ttu)
>>>>            if (!mapping) {
>>>>                pr_info("%#lx: could not lock mapping for mapped hugetlb folio\n",
>>>>                    folio_pfn(folio));
>>>> -            return;
>>>> +            return -EBUSY;
>>>>            }
>>>>               try_to_unmap(folio, ttu|TTU_RMAP_LOCKED);
>>>> @@ -1580,6 +1606,8 @@ void unmap_poisoned_folio(struct folio *folio, enum ttu_flags ttu)
>>>>        } else {
>>>>            try_to_unmap(folio, ttu);
>>>>        }
>>>> +
>>>> +    return folio_mapped(folio) ? -EBUSY : 0;
>>>
>>> Do we really need this return value? It's unused in do_migrate_range().
>>
>> I suggested it, because the folio_mapped() is nowadays extremely cheap.
>> It cleans up hwpoison_user_mappings() quite nicely.
>
> I'm also wondering, if in do_migrate_range(), we want to pr_warn_ratelimit() in case still mapped after the call. IIUC, we don't really expect this to happen with SYNC set.

Do you mean TTU_SYNC? It seems it's not set.

There might be a race will hit the proposed pr_warn_ratelimit():

/* Assume folio is isolated for reclaim, so memory_failure failed to handle it at first time. Then it's put back to LRU. */
do_migrate_range
folio_test_hwpoison
folio_mapped
<folio is isolated for reclaim again.>
unmap_poisoned_folio
<folio is put buck.>
pr_warn_ratelimit(folio_mapped)

But I might be miss something. And even this race is possible, it should be really hard to hit.

Thanks.
.