Re: [PATCH 0/6] hugetlbfs: support free page reporting

From: David Hildenbrand
Date: Wed Jan 06 2021 - 04:42:50 EST


On 06.01.21 04:46, Liang Li wrote:
> A typical usage of hugetlbfs is to reserve an amount of memory
> during the kernel boot stage, and the reserved pages are
> unlikely to return to the buddy system. When an application
> needs huge pages, the kernel allocates them from the reserved
> pool. When the application terminates, the huge pages return to
> the reserved pool and are kept in the hugetlbfs free list; these
> free pages will not return to the buddy freelist unless the
> size of the reserved pool is changed.
> Free page reporting only supports buddy pages; it can't report
> the free pages reserved for hugetlbfs. On the other hand,
> hugetlbfs is a good choice for systems with a huge amount of RAM,
> because it helps to reduce memory management overhead and
> improve system performance.
> This patch set adds support for reporting hugepages in the
> hugetlbfs free list. It can be used by the virtio_balloon driver
> for memory overcommit and to pre-zero free pages, speeding up
> memory population and page fault handling.
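
Just to restate the flow you describe in terms of the user-visible
API, a minimal userspace sketch (assuming huge pages were reserved via
hugepages= on the kernel command line; 2 MiB huge page size):

#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

#define HPAGE_LEN (2UL * 1024 * 1024)	/* one 2 MiB huge page */

int main(void)
{
	/* Served from the reserved hugetlbfs pool, not the buddy allocator. */
	void *p = mmap(NULL, HPAGE_LEN, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
	if (p == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	memset(p, 0x5a, HPAGE_LEN);	/* fault in and use the page */

	/* On munmap (or process exit) the page goes back to the hugetlbfs
	 * free list; it does not return to the buddy freelist. */
	munmap(p, HPAGE_LEN);
	return 0;
}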

You should lay out the use case + measurements. Further, you should
describe what this patch set actually does, how the behavior can be
tuned, pros and cons, etc. And you should most probably keep this as an RFC.

>
> Most of the code is 'copied' from free page reporting because
> it works in the same way, so the code could be refined to
> remove the duplication. That can be done later.

Nothing speaks against getting it right from the beginning. Otherwise
it will most likely never happen.

>
> Since some people had concerns about the side effects the 'buddy
> free page pre zero out' feature brings, I removed it from this
> series.

You should really point out what changed since the last version. I
remember Alex and Mike had some pretty solid points about what they
don't want to see (especially: don't use the free page reporting
infrastructure and don't temporarily allocate huge pages to process them).

I am not convinced that we want to use the free page reporting
infrastructure for this (pre-zeroing huge pages). What speaks against a
thread simply iterating over huge pages one at a time, zeroing them? The
whole free page reporting infrastructure was invented because we have to
do expensive coordination (+ locking) when going via the hypervisor. For
the main use case of zeroing huge pages in the background, I don't see a
real need for that. If you believe this is the right thing to do, please
add a discussion regarding this.
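
To make that concrete, this is roughly the kind of thing I have in
mind (a rough sketch only; take_zeroing_candidate()/finish_zeroing()
are placeholders for whatever dequeue/requeue + locking the hugetlb
free lists actually need, not existing kernel API):

#include <linux/kthread.h>
#include <linux/hugetlb.h>
#include <linux/mm.h>
#include <linux/sched.h>

/* Placeholders, not existing kernel API: */
static struct page *take_zeroing_candidate(void);
static void finish_zeroing(struct page *page);

static int hugepage_zeroing_thread(void *unused)
{
	while (!kthread_should_stop()) {
		/* Placeholder: grab one free, not-yet-zeroed huge page. */
		struct page *page = take_zeroing_candidate();

		if (!page) {
			schedule_timeout_interruptible(HZ);
			continue;
		}

		/* One huge page at a time: no batching, no reporting
		 * infrastructure, no hypervisor round trips. */
		clear_huge_page(page, 0,
				pages_per_huge_page(page_hstate(page)));

		/* Placeholder: mark zeroed and return it to the free list. */
		finish_zeroing(page);
		cond_resched();
	}
	return 0;
}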

--
Thanks,

David / dhildenb