Re: [RFC PATCH] mm: hugetlb: remove __GFP_THISNODE flag when dissolving the old hugetlb

From: Michal Hocko
Date: Tue Feb 06 2024 - 08:20:08 EST


On Tue 06-02-24 16:18:22, Baolin Wang wrote:
>
>
> On 2024/2/5 22:23, Michal Hocko wrote:
> > On Mon 05-02-24 21:06:17, Baolin Wang wrote:
> > [...]
> > > > It is quite possible that traditional users (like large DBs) do not use
> > > > CMA heavily so such a problem was not observed so far. That doesn't mean
> > > > those problems do not really matter.
> > >
> > > CMA is just one case, as I mentioned before, other situations can also break
> > > the per-node hugetlb pool now.
> >
> > Is there any other case than memory hotplug, which is arguably different
> > as it is a disruptive operation already?
>
> Yes, like I said before the longterm pinning, memory failure and the users
> of alloc_contig_pages() may also break the per-node hugetlb pool.

Memory failure is similar to memory hotplug in the sense that it is
a disruptive operation and fallback to a different node might be the
only option to handle it. On the other hand, longterm pinning is similar
to alloc_contig_pages() and it should fail if the page cannot be migrated
within the node.

It seems that hugetlb is lagging quite far behind on many other features
and I am not really sure how to deal with that. What is your take, Muchun
Song?

> > > Let's focus on the main point: why should we still keep inconsistent
> > > behavior when handling free and in-use hugetlb for alloc_contig_range()?
> > > That's really confusing.
> >
> > yes, this should behave consistently. And the least surprising way to
> > handle that from the user configuration POV is to not move outside of
> > the original NUMA node.
>
> So you mean we should also add __GFP_THISNODE flag in
> alloc_migration_target() when allocating a new hugetlb as the target for
> migration, that can unify the behavior and avoid breaking the per-node pool?

It is not as simple as that, because alloc_migration_target() is also
used for user-driven migration.
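
To illustrate the concern, here is a toy userspace model, not kernel
code. Every name in it (toy_pool, toy_alloc_target, TOY_THISNODE, the
toy_reason values) is hypothetical and only stands in for the idea of a
shared allocation helper with several callers. It assumes that a
node-restricting flag would have to be chosen per caller rather than set
unconditionally inside the shared helper; whether that split is the right
one is exactly what this thread leaves open.

```c
#include <assert.h>

/* Toy userspace model, NOT kernel code. All names are hypothetical. */
#define TOY_NR_NODES 2
#define TOY_THISNODE 0x1u	/* stands in for a "do not leave this node" flag */

enum toy_reason {
	TOY_USER_MIGRATION,	/* caller asked for migration explicitly */
	TOY_CONTIG_RANGE,	/* contiguous range allocation on one node */
};

struct toy_pool {
	int free_huge[TOY_NR_NODES];	/* free huge pages per node */
};

/*
 * Shared helper: take one free huge page, preferring node nid.
 * Falls back to other nodes only when TOY_THISNODE is not set.
 * Returns the node the page came from, or -1 on failure.
 */
static int toy_alloc_target(struct toy_pool *pool, int nid, unsigned int flags)
{
	assert(nid >= 0 && nid < TOY_NR_NODES);

	if (pool->free_huge[nid] > 0) {
		pool->free_huge[nid]--;
		return nid;
	}
	if (flags & TOY_THISNODE)
		return -1;	/* fail rather than break the per-node pool */

	for (int n = 0; n < TOY_NR_NODES; n++) {
		if (pool->free_huge[n] > 0) {
			pool->free_huge[n]--;
			return n;
		}
	}
	return -1;
}

/* Assumption of this sketch: the restriction depends on who is calling. */
static unsigned int toy_flags_for(enum toy_reason reason)
{
	return reason == TOY_CONTIG_RANGE ? TOY_THISNODE : 0u;
}

static int toy_migrate(struct toy_pool *pool, int nid, enum toy_reason reason)
{
	return toy_alloc_target(pool, nid, toy_flags_for(reason));
}
```

In this model, with node 0 empty and one free page on node 1, a
TOY_CONTIG_RANGE request for node 0 fails and leaves node 1 untouched,
while a TOY_USER_MIGRATION request for node 0 is served from node 1.
Putting the flag inside toy_alloc_target() itself would make both callers
fail the same way.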
--
Michal Hocko
SUSE Labs