Re: [RFC PATCH 2/2] mm,page_alloc: Make alloc_contig_range handle free hugetlb pages
From: David Hildenbrand
Date: Thu Feb 25 2021 - 17:21:49 EST
> On 25.02.2021 at 22:43, Mike Kravetz <mike.kravetz@xxxxxxxxxx> wrote:
>
> On 2/10/21 12:23 AM, David Hildenbrand wrote:
>>> On 08.02.21 11:38, Oscar Salvador wrote:
>>> --- a/mm/compaction.c
>>> +++ b/mm/compaction.c
>>> @@ -952,6 +952,17 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>>> low_pfn += compound_nr(page) - 1;
>>> goto isolate_success_no_list;
>>> }
>>> + } else {
>>
>> } else if (alloc_and_dissolve_huge_page(page)) {
>>
>> ...
>>
>>> + /*
>>> + * Free hugetlb page. Allocate a new one and
>>> + * dissolve this one if that succeeds.
>>> + */
>>> + if (alloc_and_dissolve_huge_page(page)) {
>>> + unsigned long order = buddy_order_unsafe(page);
>>> +
>>> + low_pfn += (1UL << order) - 1;
>>> + continue;
>>> + }
>>
>>
>>
>> Note that there is a very ugly corner case we will have to handle gracefully (I think also in patch #1):
>>
>> Assume you allocated a gigantic page (and assume that we are not using CMA for gigantic pages for simplicity). Assume you want to allocate another one. alloc_pool_huge_page()->...->alloc_contig_pages() will stumble over the first allocated page. It will try to alloc_and_dissolve_huge_page() the existing gigantic page. To do that, it will alloc_pool_huge_page()->...->alloc_contig_pages() ... and so on. Bad.
>>
>
> Sorry for resurrecting an old thread.
> While looking at V3 of these patches, I was exploring all the calling
> sequences looking for races and other issues. It may be that the
> endless loop of allocating and freeing gigantic pages is not actually
> an issue. Of course, I could be mistaken. Here is my reasoning:
>
> alloc_and_dissolve_huge_page (now isolate_or_dissolve_huge_page) will be
> called from __alloc_contig_migrate_range() within alloc_contig_range().
> Before calling __alloc_contig_migrate_range, we call start_isolate_page_range
> to isolate all page blocks in the range. Because all the page blocks in
> the range are isolated, another invocation of alloc_contig_range will
> not operate on any part of that range. See the comments for
> start_isolate_page_range or commit 2c7452a075d4. So, when
> start_isolate_page_range goes to allocate another gigantic page it will
> never notice/operate on the existing gigantic page.
>
> Again, this is confusing and I might be missing something.
I think you are right that the endless loop is blocked. But I think the whole thing could cascade once we have multiple gigantic pages allocated.
Try allocating a new gigantic page. We find an existing gigantic page, isolate it and try to migrate it. To do that, we try allocating a new gigantic page. We find yet another existing gigantic page, isolate it and try to migrate it ... until we have isolated all gigantic pages on our way to an actual usable area. Then we have to actually migrate all of these in reverse order ...
Of course this only works if we can actually isolate a gigantic page, which I think should be the case (they are migratable and should be marked as movable).
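
To make the cascade concrete, here is a minimal userspace sketch in C. It is a toy model only, not kernel code: `struct region`, `struct stats` and `toy_alloc_gigantic()` are hypothetical names invented for illustration, each array slot stands for one gigantic-page-sized range, and the scan order (lowest range first) is an assumption. It only models the two properties discussed above: isolated ranges are skipped by nested allocations, and an in-use gigantic page needs a freshly allocated replacement before its range can be reused.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: one slot per gigantic-page-sized range (hypothetical types). */
enum region_state { REGION_FREE, REGION_GPAGE };

struct region {
	enum region_state state;
	bool isolated;		/* stands in for isolated pageblocks */
};

struct stats {
	int max_depth;		/* deepest nesting of allocation attempts */
	int migrations;		/* gigantic pages moved to a replacement */
};

/*
 * Try to obtain one range for a new gigantic page. Returns the index of
 * the range handed to the caller, or -1 if nothing could be found.
 * Call with depth == 1 for the outermost attempt.
 */
static int toy_alloc_gigantic(struct region *r, int n, int depth,
			      struct stats *st)
{
	assert(r && st);

	if (depth > st->max_depth)
		st->max_depth = depth;

	for (int i = 0; i < n; i++) {
		int target;

		/* Ranges isolated by an outer attempt are never touched. */
		if (r[i].isolated)
			continue;

		if (r[i].state == REGION_FREE) {
			r[i].state = REGION_GPAGE;
			return i;
		}

		/*
		 * In-use gigantic page: isolate it, then allocate a
		 * replacement. The nested attempt skips this range, so it
		 * cannot loop back here, but it may hit the next in-use
		 * gigantic page and nest again.
		 */
		r[i].isolated = true;
		target = toy_alloc_gigantic(r, n, depth + 1, st);
		r[i].isolated = false;

		/* The nested scan saw every non-isolated range: give up. */
		if (target < 0)
			return -1;

		/* "Migrate" the old page to target; range i is now ours. */
		st->migrations++;
		return i;
	}

	return -1;
}
```

In this model, a layout of three in-use gigantic pages followed by one free range nests four levels deep and performs three migrations, innermost first, before the outermost request gets its range. With four in-use gigantic pages and no free range, the attempts nest five levels deep and then all fail, so the recursion terminates but does a lot of work on the way.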
>
> In any case, I agree that gigantic pages are tricky and we should leave
> them out of the discussion for now. We can rethink this later if
> necessary.
Yes, it's tricky and not strictly required right now because we never place them on ZONE_MOVABLE. And as I said, actual use cases might be rare.