Re: [RFC PATCH 02/26] hugetlb: sort hstates in hugetlb_init_hstates

From: James Houghton
Date: Tue Jun 28 2022 - 11:35:51 EST


On Mon, Jun 27, 2022 at 5:09 AM manish.mishra <manish.mishra@xxxxxxxxxxx> wrote:
>
>
> On 24/06/22 11:06 pm, James Houghton wrote:
> > When using HugeTLB high-granularity mapping, we need to go through the
> > supported hugepage sizes in decreasing order so that we pick the largest
> > size that works. Consider the case where we're faulting in a 1G hugepage
> > for the first time: we want hugetlb_fault/hugetlb_no_page to map it with
> > a PUD. By going through the sizes in decreasing order, we will find that
> > PUD_SIZE works before finding out that PMD_SIZE or PAGE_SIZE work too.
> >
> > Signed-off-by: James Houghton <jthoughton@xxxxxxxxxx>
> > ---
> > mm/hugetlb.c | 40 +++++++++++++++++++++++++++++++++++++---
> > 1 file changed, 37 insertions(+), 3 deletions(-)
> >
> > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > index a57e1be41401..5df838d86f32 100644
> > --- a/mm/hugetlb.c
> > +++ b/mm/hugetlb.c
> > @@ -33,6 +33,7 @@
> > #include <linux/migrate.h>
> > #include <linux/nospec.h>
> > #include <linux/delayacct.h>
> > +#include <linux/sort.h>
> >
> > #include <asm/page.h>
> > #include <asm/pgalloc.h>
> > @@ -48,6 +49,10 @@
> >
> > int hugetlb_max_hstate __read_mostly;
> > unsigned int default_hstate_idx;
> > +/*
> > + * After hugetlb_init_hstates is called, hstates will be sorted from largest
> > + * to smallest.
> > + */
> > struct hstate hstates[HUGE_MAX_HSTATE];
> >
> > #ifdef CONFIG_CMA
> > @@ -3144,14 +3149,43 @@ static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
> > kfree(node_alloc_noretry);
> > }
> >
> > +static int compare_hstates_decreasing(const void *a, const void *b)
> > +{
> > + const int shift_a = huge_page_shift((const struct hstate *)a);
> > + const int shift_b = huge_page_shift((const struct hstate *)b);
> > +
> > + if (shift_a < shift_b)
> > + return 1;
> > + if (shift_a > shift_b)
> > + return -1;
> > + return 0;
> > +}
> > +
> > +static void sort_hstates(void)
> > +{
> > + unsigned long default_hstate_sz = huge_page_size(&default_hstate);
> > +
> > + /* Sort from largest to smallest. */
> > + sort(hstates, hugetlb_max_hstate, sizeof(*hstates),
> > + compare_hstates_decreasing, NULL);
> > +
> > + /*
> > + * We may have changed the location of the default hstate, so we need to
> > + * update it.
> > + */
> > + default_hstate_idx = hstate_index(size_to_hstate(default_hstate_sz));
> > +}
> > +
> > static void __init hugetlb_init_hstates(void)
> > {
> > struct hstate *h, *h2;
> >
> > - for_each_hstate(h) {
> > - if (minimum_order > huge_page_order(h))
> > - minimum_order = huge_page_order(h);
> > + sort_hstates();
> >
> > + /* The last hstate is now the smallest. */
> > + minimum_order = huge_page_order(&hstates[hugetlb_max_hstate - 1]);
> > +
> > + for_each_hstate(h) {
> > /* oversize hugepages were init'ed in early boot */
> > if (!hstate_is_gigantic(h))
> > hugetlb_hstate_alloc_pages(h);
>
> Now that hstates are ordered, can the code that calculates demote_order
> be optimised too? I mean, it could just be the order of the hstate at the
> next index.
>

Indeed -- thanks for catching that. I'll make this optimization for
the next version of this series.
