Re: [RFC PATCH] mm,memory_hotplug: Unlock 1GB-hugetlb on x86_64
From: David Hildenbrand
Date: Thu Feb 28 2019 - 02:38:39 EST

On 27.02.19 23:00, Mike Kravetz wrote:
> On 2/27/19 1:51 PM, Oscar Salvador wrote:
>> On Thu, Feb 21, 2019 at 10:42:12AM +0100, Oscar Salvador wrote:
>>> [1] https://lore.kernel.org/patchwork/patch/998796/
>>>
>>> Signed-off-by: Oscar Salvador <osalvador@xxxxxxx>
>>
>> Any further comments on this?
>> I do have a "concern" I would like to sort out before dropping the RFC:
>>
>> It is the fact that unless we have spare gigantic pages in other nodes, the
>> offlining operation will loop forever (until the customer cancels the operation).
>> While I do not really like that, I do think that memory offlining should be done
>> with some sanity, and the administrator should know in advance if the system is going
>> to be able to keep up with the memory pressure, aka: make sure we have what we need
>> for the offlining operation to succeed.
>> That translates to making sure that we have spare gigantic pages and that other
>> nodes can take them.
>>
>> Having said that, another thing I thought about is that we could check if we have
>> spare gigantic pages at has_unmovable_pages() time.
>> Something like checking "h->free_huge_pages - h->resv_huge_pages > 0", and if it
>> turns out that we do not have gigantic pages anywhere, just return as we have
>> non-movable pages.
>
> Of course, that check would be racy. Even if there is an available gigantic
> page at has_unmovable_pages() time there is no guarantee it will be there when
> we want to allocate/use it. But, you would at least catch 'most' cases of
> looping forever.
>
>> But I would rather not convolute has_unmovable_pages() with such checks and "trust"
>> the administrator.

I think we have the exact same issue already with huge/ordinary pages if
we are low on memory. We could loop forever.
In the long run, we should properly detect such issues and abort instead
of looping forever, I guess. But as we all know, error handling in the
whole offlining part is still far from perfect ...
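
For illustration only, the "h->free_huge_pages - h->resv_huge_pages > 0"
check quoted above could be sketched roughly as below. This is a standalone
userspace model, not kernel code: struct hstate_model and
has_spare_gigantic_page() are hypothetical names made up for the example,
and only the two field names are taken from the quoted expression.

```c
#include <stdbool.h>

/*
 * Userspace model only. The two field names mirror the expression quoted
 * above; struct hstate_model itself is a hypothetical stand-in.
 */
struct hstate_model {
	unsigned long free_huge_pages;	/* pages currently in the free pool */
	unsigned long resv_huge_pages;	/* free pages already reserved */
};

/*
 * True if at least one free page is not covered by a reservation.
 * Written as a comparison rather than "free - resv > 0": with unsigned
 * counters the subtraction would wrap around if resv ever exceeded free.
 */
static bool has_spare_gigantic_page(const struct hstate_model *h)
{
	return h->free_huge_pages > h->resv_huge_pages;
}
```

As Mike notes, any such answer is only a snapshot. The pool can change
before the migration actually allocates a page, so a caller would still
have to cope with the allocation failing later.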
--
Thanks,
David / dhildenb