Re: [PATCH 1/2] mm: Make alloc_contig_range handle free hugetlb pages

From: Oscar Salvador
Date: Thu Feb 18 2021 - 06:29:49 EST


On Wed, Feb 17, 2021 at 04:00:11PM +0100, Michal Hocko wrote:
> Is this really necessary? dissolve_free_huge_page will take care of this
> and the race windown you are covering is really tiny.

Probably not, I was trying to shrink to race window as much as possible
but the call to dissolve_free_huge_page might be enough.

> > + nid = page_to_nid(page);
> > + spin_unlock(&hugetlb_lock);
> > +
> > + /*
> > + * Before dissolving the page, we need to allocate a new one,
> > + * so the pool remains stable.
> > + */
> > + new_page = alloc_fresh_huge_page(h, gfp_mask, nid, nmask, NULL);
>
> wrt. fallback to other zones, I haven't realized that the primary
> usecase is a form of memory offlining (from virt-mem). I am not yet sure
> what the proper behavior is in that case but if breaking hugetlb pools,
> similar to the normal hotplug operation, is viable then this needs a
> special mode. We do not want a random alloc_contig_range user to do the
> same. So for starter I would go with __GFP_THISNODE here.

Ok, makes sense.
__GFP_THISNODE will not allow fallback to other node's zones.
Since we only allow the nid the page belongs to, nodemask should be
NULL, right?

> > + if (!h)
> > + /*
> > + * The page might have been dissolved from under our feet.
> > + * If that is the case, return success as if we dissolved it
> > + * ourselves.
> > + */
> > + return true;
>
> nit I would put the comment above the conditin for both cases. It reads
> more easily that way. At least without { }.

Yes, makes sense.

> Other than that I haven't noticed any surprises.

I did. The 'put_page' call should be placed above, right after getting
the page. Otherwise, refcount == 1 and we will fail to dissolve the
new page if we need to (in case old page fails to be dissolved).
I already fixed that locally.

--
Oscar Salvador
SUSE L3