Re: [External] Re: [PATCH v4 04/21] mm/hugetlb: Introduce nr_free_vmemmap_pages in the struct hstate
From: Muchun Song
Date: Mon Nov 16 2020 - 10:42:11 EST
On Mon, Nov 16, 2020 at 9:33 PM Oscar Salvador <osalvador@xxxxxxx> wrote:
>
> On Fri, Nov 13, 2020 at 06:59:35PM +0800, Muchun Song wrote:
> > If the size of HugeTLB page is 2MB, we need 512 struct page structures
> > (8 pages) to be associated with it. As far as I know, we only use the
> > first 4 struct page structures. Use of first 4 struct page structures
> > comes from HUGETLB_CGROUP_MIN_ORDER.
>
> Since you mention the 2MB HugeTLB page and its specifics, I would also
> mention 1GB HugeTLB pages, maybe something along these lines.
> I would suppress "As far as I know"; we __know__ that we only use
> the first 4 struct page structures to track metadata information.
Thanks. Will do.
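
For reference, the arithmetic for both sizes can be sketched roughly as
below. This is only an illustration: it assumes 4KB base pages (a page
shift of 12) and a 64-byte struct page, neither of which is stated in this
thread, and the demo_* names are made up for the example.

```c
/* Assumptions for illustration only: 4KB base pages, 64-byte struct page. */
#define DEMO_PAGE_SHIFT		12
#define DEMO_STRUCT_PAGE_SIZE	64UL
/* Mirrors the value of RESERVE_VMEMMAP_NR from the patch. */
#define DEMO_RESERVE_VMEMMAP_NR	2UL

/* Number of vmemmap pages backing one HugeTLB page of the given order. */
unsigned long demo_vmemmap_pages(unsigned int order)
{
	return ((1UL << order) * DEMO_STRUCT_PAGE_SIZE) >> DEMO_PAGE_SHIFT;
}

/* Number of those vmemmap pages that could be freed, or 0 if too few. */
unsigned long demo_free_vmemmap_pages(unsigned int order)
{
	unsigned long pages = demo_vmemmap_pages(order);

	if (pages > DEMO_RESERVE_VMEMMAP_NR)
		return pages - DEMO_RESERVE_VMEMMAP_NR;
	return 0;
}
```

Under these assumptions, order 9 (2MB) gives 8 vmemmap pages of which 6
could be freed, and order 18 (1GB) gives 4096 vmemmap pages of which 4094
could be freed.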
>
> > +/*
> > + * There are 512 struct page structures(8 pages) associated with each 2MB
> > + * hugetlb page. For tail pages, the value of compound_head is the same.
> > + * So we can reuse first page of tail page structures. We map the virtual
> > + * addresses of the remaining 6 pages of tail page structures to the first
> > + * tail page struct, and then free these 6 pages. Therefore, we need to
> > + * reserve at least 2 pages as vmemmap areas.
> > + */
> > +#define RESERVE_VMEMMAP_NR 2U
>
> Either I would include the 1GB specifics there as well, or I would not add
> any specifics at all and just say that the first two pages are used,
> and the rest can be remapped to the first page that contains the tails.
Thanks. Will do.
>
>
> > +void __init hugetlb_vmemmap_init(struct hstate *h)
> > +{
> > + unsigned int order = huge_page_order(h);
> > + unsigned int vmemmap_pages;
> > +
> > + vmemmap_pages = ((1 << order) * sizeof(struct page)) >> PAGE_SHIFT;
> > + /*
> > + * The head page and the first tail page are not to be freed to buddy
> > + * system, the others page will map to the first tail page. So there
> "the remaining pages" might be more clear.
Thanks.
>
> > + * are (@vmemmap_pages - RESERVE_VMEMMAP_NR) pages can be freed.
> "that can be freed"
Thanks.
>
> > + *
> > + * Could RESERVE_VMEMMAP_NR be greater than @vmemmap_pages? This is
> > + * not expected to happen unless the system is corrupted. So on the
> > + * safe side, it is only a safety net.
> > + */
> > + if (likely(vmemmap_pages > RESERVE_VMEMMAP_NR))
> > + h->nr_free_vmemmap_pages = vmemmap_pages - RESERVE_VMEMMAP_NR;
> > + else
> > + h->nr_free_vmemmap_pages = 0;
>
> This made me think of something.
> Since struct hstate hstates is global, all the fields should be initialized to 0.
> So, the following assignments in hugetlb_add_hstate:
>
> h->nr_huge_pages = 0;
> h->free_huge_pages = 0;
>
> should not be needed.
> Actually, we do not initialize other values like resv_huge_pages
> or surplus_huge_pages.
>
> If that is the case, the "else" could go.
Yeah, I agree with you.
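
A small userspace sketch of why the "else" would be redundant: C
zero-initializes objects with static storage duration, so a field that is
never assigned stays 0. The struct and function names below are
hypothetical stand-ins, not the real struct hstate or the patch code.

```c
/* Mirrors the value of RESERVE_VMEMMAP_NR from the patch. */
#define DEMO_RESERVE_VMEMMAP_NR	2U

/* Hypothetical, cut-down stand-in for struct hstate. */
struct demo_hstate {
	unsigned long nr_huge_pages;
	unsigned long free_huge_pages;
	unsigned int nr_free_vmemmap_pages;
};

/* Static storage duration: every field starts out as 0. */
struct demo_hstate demo_hstates[2];

/* Same check as in the patch, but without the "else" branch. */
void demo_vmemmap_init(struct demo_hstate *h, unsigned int vmemmap_pages)
{
	if (vmemmap_pages > DEMO_RESERVE_VMEMMAP_NR)
		h->nr_free_vmemmap_pages = vmemmap_pages - DEMO_RESERVE_VMEMMAP_NR;
	/* Otherwise the field keeps its zero-initialized value. */
}
```

This only holds as long as nothing else writes the field before the init
function runs, which is an assumption of the sketch.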
>
> Mike?
>
> The changes themselves look good to me.
> I think that putting all the vmemmap stuff into hugetlb-vmemmap.* was
> the right choice.
>
>
> --
> Oscar Salvador
> SUSE L3
--
Yours,
Muchun