Re: [PATCH v3 0/5] hugetlb: add support gigantic page allocation at runtime

From: Luiz Capitulino
Date: Fri Apr 25 2014 - 16:19:20 EST


On Tue, 22 Apr 2014 14:55:46 -0700
Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:

> On Tue, 22 Apr 2014 17:37:26 -0400 Luiz Capitulino <lcapitulino@xxxxxxxxxx> wrote:
>
> > On Thu, 17 Apr 2014 16:01:10 -0700
> > Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > > On Thu, 10 Apr 2014 13:58:40 -0400 Luiz Capitulino <lcapitulino@xxxxxxxxxx> wrote:
> > >
> > > > The HugeTLB subsystem uses the buddy allocator to allocate hugepages during
> > > > runtime. This means that hugepages allocation during runtime is limited to
> > > > MAX_ORDER order. For archs supporting gigantic pages (that is, page sizes
> > > > greater than MAX_ORDER), this in turn means that those pages can't be
> > > > allocated at runtime.
> > >
> > > Dumb question: what's wrong with just increasing MAX_ORDER?
> >
> > To be honest I'm not a buddy allocator expert and I'm not familiar with
> > what is involved in increasing MAX_ORDER. What I do know though is that it's
> > not just a matter of increasing a macro's value. For example, for sparsemem
> > support we have this check (include/linux/mmzone.h:1084):
> >
> > #if (MAX_ORDER - 1 + PAGE_SHIFT) > SECTION_SIZE_BITS
> > #error Allocator MAX_ORDER exceeds SECTION_SIZE
> > #endif
> >
> > I _guess_ it's because we can't allocate more pages than what's within a
> > section on sparsemem. Can sparsemem and the other stuff be changed to
> > accommodate a bigger MAX_ORDER? I don't know. Is it worth it to increase
> > MAX_ORDER and do all the required changes, given that a bigger MAX_ORDER is
> > only useful for HugeTLB and the archs supporting gigantic pages? I'd guess not.
>
> AFAICT we'd need to increase SECTION_SIZE_BITS to 29 or more to
> accommodate 1G MAX_ORDER. I assume this means that some machines with
> sparse physical memory layout may not be able to use all (or as much)
> of the physical memory. Perhaps Yinghai can advise?

Yinghai?
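
For reference, here is a rough sketch of the arithmetic behind that check,
written as plain userspace C rather than kernel code. The helper names are
made up for illustration, and the constants are assumptions: 4K base pages
(PAGE_SHIFT = 12) and x86_64's SECTION_SIZE_BITS of 27. Other archs use
different values. If I'm doing the math right, a 1G page is order 18, so
MAX_ORDER would have to be 19, and the check as written only passes once
SECTION_SIZE_BITS reaches 30.

```c
#include <assert.h>

/* Assumed x86_64 values; not taken from this thread. */
#define SKETCH_PAGE_SHIFT        12  /* 4 KiB base pages */
#define SKETCH_SECTION_SIZE_BITS 27  /* 128 MiB sparsemem sections */

/*
 * Smallest MAX_ORDER that lets the buddy allocator hand out a block of
 * 2^size_shift bytes. The largest allocatable order is MAX_ORDER - 1.
 * (Hypothetical helper, for illustration only.)
 */
static unsigned int max_order_for(unsigned int size_shift,
                                  unsigned int page_shift)
{
	return size_shift - page_shift + 1;
}

/*
 * Mirrors the mmzone.h preprocessor check quoted above: a nonzero
 * return means the "#error" would fire for these values.
 * (Hypothetical helper, for illustration only.)
 */
static int exceeds_section(unsigned int max_order,
                           unsigned int page_shift,
                           unsigned int section_size_bits)
{
	return (max_order - 1 + page_shift) > section_size_bits;
}

/* Walks through the 1G case under the assumptions above. */
static void sketch_selfcheck(void)
{
	unsigned int order_1g = max_order_for(30, SKETCH_PAGE_SHIFT);

	assert(order_1g == 19);
	/* Today's section size is too small for a 1G MAX_ORDER. */
	assert(exceeds_section(order_1g, SKETCH_PAGE_SHIFT,
			       SKETCH_SECTION_SIZE_BITS));
	/* A section size of 2^30 bytes is the first one that passes. */
	assert(!exceeds_section(order_1g, SKETCH_PAGE_SHIFT, 30));
}
```

Calling sketch_selfcheck() from a small main() runs the three cases. This
says nothing about whether bumping SECTION_SIZE_BITS is actually safe on
machines with sparse physical layouts, which is the open question.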

> I do think we should fully explore this option before giving up and
> adding new special-case code.

I'll look into that, but it may take a bit.