RE: [PATCH v3] mm/hugetlb: avoid hardcoding while checking if cma is enable

From: Song Bao Hua (Barry Song)
Date: Wed Jul 08 2020 - 18:11:28 EST




> -----Original Message-----
> From: Roman Gushchin [mailto:guro@xxxxxx]
> Sent: Thursday, July 9, 2020 6:46 AM
> To: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>; Song Bao Hua (Barry Song)
> <song.bao.hua@xxxxxxxxxxxxx>; linux-mm@xxxxxxxxx;
> linux-kernel@xxxxxxxxxxxxxxx; Linuxarm <linuxarm@xxxxxxxxxx>; Jonathan
> Cameron <jonathan.cameron@xxxxxxxxxx>
> Subject: Re: [PATCH v3] mm/hugetlb: avoid hardcoding while checking if cma
> is enable
>
> On Wed, Jul 08, 2020 at 10:45:16AM -0700, Mike Kravetz wrote:
> > On 7/7/20 12:56 PM, Andrew Morton wrote:
> > > On Tue, 7 Jul 2020 16:02:04 +1200 Barry Song <song.bao.hua@xxxxxxxxxxxxx> wrote:
> > >
> > >> hugetlb_cma[0] can be NULL due to various reasons, for example, node0 has
> > >> no memory. so NULL hugetlb_cma[0] doesn't necessarily mean cma is not
> > >> enabled. gigantic pages might have been reserved on other nodes.
> > >
> > > I'm trying to figure out whether this should be backported into 5.7.1,
> > > but the changelog doesn't describe any known user-visible effects of
> > > the bug. Are there any?
> >
> > Barry must have missed this email. He reported the issue so I was hoping
> > he would reply.

Yep, it would be better to backport it to 5.7. It doesn't cause a serious crash or
failure, but it could cause a double reservation or a CMA leak.
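
To make the failure mode concrete, here is a small userspace model of the check. It is
only a sketch, not the kernel code: MAX_NODES, the struct cma stand-in and all of the
function names are made up for illustration. Only hugetlb_cma[] and hugetlb_cma_size
correspond to names used in this thread. It assumes a two-node machine where node 0
has no memory, so the CMA area exists only on node 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_NODES 2	/* hypothetical stand-in for the real node limit */

struct cma { unsigned long nr_pages; };	/* stand-in, not the kernel's struct cma */

static struct cma *hugetlb_cma[MAX_NODES];
static unsigned long hugetlb_cma_size;	/* bytes requested via hugetlb_cma= */

/* Hardcoded check: looks at node 0 only. */
bool cma_enabled_node0_only(void)
{
	return hugetlb_cma[0] != NULL;
}

/* Check keyed on the requested size, independent of any single node. */
bool cma_enabled_by_size(void)
{
	return hugetlb_cma_size != 0;
}

/* Gigantic pages the bootmem path would hand out for a given check result. */
unsigned long bootmem_pages(bool cma_enabled, unsigned long requested)
{
	return cma_enabled ? 0 : requested;
}

/* hugetlb_cma= was given, node 0 is memoryless, the area sits on node 1. */
void setup_memoryless_node0(void)
{
	static struct cma node1_area = { .nr_pages = 512 };

	hugetlb_cma_size = 1UL << 30;
	hugetlb_cma[0] = NULL;
	hugetlb_cma[1] = &node1_area;
}

/* The two checks disagree in this topology. */
void check_model(void)
{
	setup_memoryless_node0();
	assert(!cma_enabled_node0_only());
	assert(cma_enabled_by_size());
}
```

In this model, bootmem_pages(cma_enabled_node0_only(), N) still returns N although CMA
has been reserved on node 1, which is the double reservation. With the size-based
check it returns 0.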

> >
> > Based on the code changes, I believe the following could happen:
> > - Someone uses 'hugetlb_cma=' kernel command line parameter to reserve
> > CMA for gigantic pages.
> > - The system topology is such that no memory is on node 0. Therefore,
> > no CMA can be reserved for gigantic pages on node 0. CMA is reserved
> > on other nodes.
> > - The user also specifies a number of gigantic pages to pre-allocate on
> > the command line with hugepagesz=<gigantic_page_size> hugepages=<N>
> > - The routine which allocates gigantic pages from the bootmem allocator
> > will not detect CMA has been reserved as there is no memory on node 0.
> > Therefore, pages will be pre-allocated from bootmem allocator as well
> > as reserved in CMA.
> >
> > This double allocation (bootmem and CMA) is the worst case scenario. Not
> > sure if this is what Barry saw, and I suspect this would rarely happen.
> >
> > After writing this, I started to think that perhaps command line parsing
> > should be changed. If hugetlb_cma= is specified, it makes no sense to
> > pre-allocate gigantic pages. Therefore, the hugepages=<N> parameter
> > should be ignored and flagged with a warning if hugetlb_cma= is specified.
> > This could be checked at parsing time and there would be no need for such
> > a check in the allocation code (except for sanity checking).
> >
> > Thoughts? I just cleaned up the parsing code and could make such a change
> > quite easily.
>
> I agree. Basically, if hugetlb_cma_size > 0, we should not pre-allocate
> gigantic pages. It would be much simpler and more reliable than the existing
> code.

I agree this is a better solution. If hugetlb_cma has higher priority than bootmem
gigantic pages, we should document it.
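
For the documentation, the rule could be stated roughly as in the sketch below. This
is a userspace illustration only: struct boot_params and bootmem_prealloc_count() are
hypothetical names, not the existing parsing code, and the warning text is just a
placeholder.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical parsed command-line state, not the kernel's data structures. */
struct boot_params {
	unsigned long hugetlb_cma_size;	/* bytes from hugetlb_cma=, 0 if absent */
	bool gigantic;			/* hugepagesz= selects a gigantic page size */
	unsigned long hugepages;	/* N from hugepages=<N> */
};

/*
 * Number of pages to pre-allocate from bootmem. If hugetlb_cma= was given,
 * a gigantic hugepages=<N> request is ignored and flagged with a warning.
 */
unsigned long bootmem_prealloc_count(const struct boot_params *p)
{
	if (p->gigantic && p->hugetlb_cma_size > 0) {
		if (p->hugepages)
			fprintf(stderr,
				"hugetlb_cma= is set, ignoring hugepages=%lu\n",
				p->hugepages);
		return 0;
	}
	return p->hugepages;
}
```

With this rule, hugetlb_cma= always wins for gigantic pages, and non-gigantic
hugepages=<N> requests are unaffected.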

>
> Thank you!

Thanks
Barry