Re: [RFC PATCH 2/3] hugetlb: convert page_huge_active() to HPageMigratable flag

From: Mike Kravetz
Date: Fri Jan 15 2021 - 12:44:40 EST


On 1/15/21 1:17 AM, Oscar Salvador wrote:
> On Mon, Jan 11, 2021 at 01:01:51PM -0800, Mike Kravetz wrote:
>> Use the new hugetlb page specific flag to replace the page_huge_active
>> interfaces. By it's name, page_huge_active implied that a huge page
>> was on the active list. However, that is not really what code checking
>> the flag wanted to know. It really wanted to determine if the huge
>> page could be migrated. This happens when the page is actually added
>> the page cache and/or task page table. This is the reasoning behind the
>> name change.
>>
>> The VM_BUG_ON_PAGE() calls in the interfaces were not really necessary
>> as in all case but one we KNOW the page is a hugetlb page. Therefore,
>> they are removed. In one call to HPageMigratable() is it possible for
>> the page to not be a hugetlb page due to a race. However, the code
>> making the call (scan_movable_pages) is inherently racy, and page state
>> will be validated later in the migration process.
>>
>> Note: Since HPageMigratable is used outside hugetlb.c, it can not be
>> static. Therefore, a new set of hugetlb page flag macros is added for
>> non-static flag functions.
>
> Two things about this one:
>
> I am not sure about the name of this one.
> It is true that page_huge_active() was only called by memory-hotplug and all
> it wanted to know was whether the page was in-use and so if it made sense
> to migrate it, so I see some value in the new PageMigratable flag.
>
> However, not all in-use hugetlb can be migrated, e.g: we might have constraints
> when it comes to migrate certain sizes of hugetlb, right?
> So setting HPageMigratable to all active hugetlb pages might be a bit misleading?
> HPageActive maybe? (Sorry, don't have a replacement)

You concerns about the name change are correct.

The reason for the change came about from discussions about Muchun's series
of fixes and the need for a new 'page is freed' status to fix a race. In
that discussion, Michal asked 'Why can't we simply set page_huge_active when
the page is allocated and put on the active list?'. That is mentioned above,
but we really do not want to try and migrate pages after they are allocated
and before they are in use. That causes problems in the fault handling code.

Anyway, that is how the suggestion for Migration came about.

In that discussion David Hildenbrand noted that code in alloc_contig_range
should migrate free hugetlb pages, but there is no support for that today.
I plan to look at that if nobody else does. When such code is added, the
name 'Migratable' will become less applicable.

I'm not great at naming. Perhaps 'In_Use' as a flag name might fit better.

> The other thing is that you are right that scan_movable_pages is racy, but
> page_huge_active() was checking if the page had the Head flag set before
> retrieving page[1].
>
> Before the page_huge_active() in scan_movable_pages() we have the
> if (!PageHuge(page)) check, but could it be that between that check and
> the page_huge_active(), the page gets dissolved, and so we are checking
> a wrong page[1]? Am I making sense?

Yes, you are making sense.

The reason I decided to drop the check is because it does not eliminate the
race. Even with that check in page_huge_active, the page could be dissolved
between that check and check of page[1]. There really is no way to eliminate
the race without holding a reference to the page (or hugetlb_lock). That
check in page_huge_active just shortens the race window.

--
Mike Kravetz