Re: [PATCH] mm: migrate: fix an incorrect call of prep_transhuge_page()

From: Zi Yan
Date: Wed Nov 22 2017 - 04:18:53 EST


On 22 Nov 2017, at 3:54, Michal Hocko wrote:

> On Mon 20-11-17 21:18:55, Zi Yan wrote:
>> From: Zi Yan <zi.yan@xxxxxxxxxxxxxx>
>>
>> In [1], Andrea reported that during memory hotplug/hot remove
>> prep_transhuge_page() is called incorrectly on non-THP pages for
>> migration, when THP is on but THP migration is not enabled.
>> This leads to a bad state of target pages for migration.
>>
>> This patch fixes it by only calling prep_transhuge_page() when we are
>> certain that the target page is THP.
>>
>> [1] https://lkml.org/lkml/2017/11/20/411
>
> lkml.org tends to be quite unstable so a
> http://lkml.kernel.org/r/$msg-id is usually a preferred way.

Got it. Thanks.

>
>>
>> Cc: stable@xxxxxxxxxxxxxxx # v4.14
>> Fixes: 8135d8926c08 ("mm: memory_hotplug: memory hotremove supports thp migration")
>> Reported-by: Andrea Reale <ar@xxxxxxxxxxxxxxxxxx>
>> Signed-off-by: Zi Yan <zi.yan@xxxxxxxxxxxxxx>
>> Cc: Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx>
>> Cc: "JÃrÃme Glisse" <jglisse@xxxxxxxxxx>
>> ---
>> include/linux/migrate.h | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/include/linux/migrate.h b/include/linux/migrate.h
>> index 895ec0c4942e..a2246cf670ba 100644
>> --- a/include/linux/migrate.h
>> +++ b/include/linux/migrate.h
>> @@ -54,7 +54,7 @@ static inline struct page *new_page_nodemask(struct page *page,
>> new_page = __alloc_pages_nodemask(gfp_mask, order,
>> preferred_nid, nodemask);
>>
>> - if (new_page && PageTransHuge(page))
>> + if (new_page && PageTransHuge(new_page))
>> prep_transhuge_page(new_page);
>
> I would keep the two checks consistent. But that leads to a more
> interesting question. new_page_nodemask does
>
> if (thp_migration_supported() && PageTransHuge(page)) {
> order = HPAGE_PMD_ORDER;
> gfp_mask |= GFP_TRANSHUGE;
> }
>
> How come it is safe to allocate an order-0 page if
> !thp_migration_supported() when we are about to migrate THP? This
> doesn't make any sense to me. Are we working around this somewhere else?
> Why shouldn't we simply return NULL here?

If !thp_migration_supported(), we will first split a THP and migrate its head page. This process
is done in unmap_and_move() after get_new_page() (the function pointer to this new_page_nodemask())
is called. The situation can be PageTransHuge(page) is true here, but the page is split
in unmap_and_move(), so we want to return a order-0 page here.

I think the confusion comes from that there is no guarantee of THP allocation when we are
doing THP migration. If we can allocate a THP during THP migration, we are good. Otherwise, we want to
fallback to the old way, splitting the original THP and migrating the head page, to preserve
the original code behavior.

Does it clarify your confusion?


â
Best Regards,
Yan Zi

Attachment: signature.asc
Description: OpenPGP digital signature