Re: [External] Re: [v2 1/6] mm: hugetlb: Skip prep of tail pages when HVO is enabled

From: Usama Arif
Date: Wed Aug 02 2023 - 06:06:07 EST




On 01/08/2023 00:18, Mike Kravetz wrote:
On 07/30/23 16:16, Usama Arif wrote:
When vmemmap is optimizable, it will free all the duplicated tail
pages in hugetlb_vmemmap_optimize while preparing the new hugepage.
Hence, there is no need to prepare them.

For 1G x86 hugepages, it avoids preparing
262144 - 64 = 262080 struct pages per hugepage.

The indirection of using __prep_compound_gigantic_folio is also removed,
as it just creates extra functions to indicate demote which can be done
with the argument.

Signed-off-by: Usama Arif <usama.arif@xxxxxxxxxxxxx>
---
mm/hugetlb.c | 32 ++++++++++++++------------------
mm/hugetlb_vmemmap.c | 2 +-
mm/hugetlb_vmemmap.h | 15 +++++++++++----
3 files changed, 26 insertions(+), 23 deletions(-)

Thanks,

I just started looking at this series. Adding Muchun on Cc:


diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 64a3239b6407..541c07b6d60f 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1942,14 +1942,23 @@ static void prep_new_hugetlb_folio(struct hstate *h, struct folio *folio, int ni
 	spin_unlock_irq(&hugetlb_lock);
 }
 
-static bool __prep_compound_gigantic_folio(struct folio *folio,
-					unsigned int order, bool demote)
+static bool prep_compound_gigantic_folio(struct folio *folio, struct hstate *h, bool demote)
 {
 	int i, j;
+	int order = huge_page_order(h);
 	int nr_pages = 1 << order;
 	struct page *p;
 
 	__folio_clear_reserved(folio);
+
+	/*
+	 * No need to prep pages that will be freed later by hugetlb_vmemmap_optimize.
+	 * Hence, reduce nr_pages to the pages that will be kept.
+	 */
+	if (IS_ENABLED(CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP) &&
+	    vmemmap_should_optimize(h, &folio->page))

IIUC, vmemmap_optimize_enabled (checked in vmemmap_should_optimize) can be
modified at runtime via sysctl. If so, what prevents it from being changed
between this check and the later call to hugetlb_vmemmap_optimize()?

Hi,

Thanks for the review.

Yes, that's a good catch. The solution for this issue would be to turn hugetlb_free_vmemmap into a core_param with a callback, and take a lock both around the write and around the places where it is read, e.g. in gather_bootmem_prealloc, etc.
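
Something like the following rough sketch (untested; the lock and function names here are made up by me, only core_param_cb/param_set_bool/param_get_bool are existing kernel APIs, and this would live next to the existing vmemmap_optimize_enabled definition in mm/hugetlb_vmemmap.c):

static DEFINE_MUTEX(vmemmap_optimize_lock);

/* Serialize writes to vmemmap_optimize_enabled against readers such as
 * gather_bootmem_prealloc(), which would have to take the same lock. */
static int set_vmemmap_optimize_enabled(const char *val,
					const struct kernel_param *kp)
{
	int ret;

	mutex_lock(&vmemmap_optimize_lock);
	ret = param_set_bool(val, kp);
	mutex_unlock(&vmemmap_optimize_lock);
	return ret;
}

static const struct kernel_param_ops vmemmap_optimize_ops = {
	.set = set_vmemmap_optimize_enabled,
	.get = param_get_bool,
};

core_param_cb(hugetlb_free_vmemmap, &vmemmap_optimize_ops,
	      &vmemmap_optimize_enabled, 0644);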

But the bigger issue at runtime is the one Muchun pointed out: the struct page refcount of the skipped tail pages is not frozen to 0.
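
For reference, that freeze is what the prep loop does today for each tail page, roughly (paraphrased from the existing __prep_compound_gigantic_folio, so the pages skipped by this patch would miss it):

	if (!demote && !page_ref_freeze(p, 1)) {
		/* a speculative get_page_unless_zero() holds a reference */
		pr_warn("HugeTLB page can not be used due to unexpected inflated ref count\n");
		goto out_error;
	}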

My main usecase (and maybe for others as well?) is reserving these gigantic pages at boot time. I had hoped the runtime improvement would come for free with it, but it doesn't look like it. Both issues can be solved by limiting the optimization to boot time: vmemmap_optimize_enabled cannot be changed while booting, and nothing else can get a reference to those pages either (after patch 6 they aren't even initialized by memblock). So I will include the below diff in the next revision to solve both issues.


diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 8434100f60ae..790842a6f978 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1956,7 +1956,8 @@ static bool prep_compound_gigantic_folio(struct folio *folio, struct hstate *h,
 	 * Hence, reduce nr_pages to the pages that will be kept.
 	 */
 	if (IS_ENABLED(CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP) &&
-	    vmemmap_should_optimize(h, &folio->page))
+	    vmemmap_should_optimize(h, &folio->page) &&
+	    system_state == SYSTEM_BOOTING)
 		nr_pages = HUGETLB_VMEMMAP_RESERVE_SIZE / sizeof(struct page);
 
 	for (i = 0; i < nr_pages; i++) {
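
For completeness, the numbers behind the commit message, assuming x86-64 (4K PAGE_SIZE, 64-byte struct page, and HUGETLB_VMEMMAP_RESERVE_SIZE == PAGE_SIZE); just a userspace sanity check, not kernel code:

#include <assert.h>

#define PAGE_SZ		4096UL	/* x86-64 base page size assumed */
#define STRUCT_PAGE_SZ	64UL	/* sizeof(struct page) on x86-64 assumed */

int main(void)
{
	unsigned long nr_pages = 1UL << 18;		/* 1G hugepage = 262144 4K pages */
	unsigned long kept = PAGE_SZ / STRUCT_PAGE_SZ;	/* 64 struct pages kept */

	assert(nr_pages - kept == 262080);		/* prep is skipped for these */
	return 0;
}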

Thanks,
Usama