Re: [PATCH 06/10] mm: page_alloc: fix freelist movement during block conversion

From: Johannes Weiner
Date: Fri Apr 05 2024 - 12:57:01 EST


Hi Baolin,

On Fri, Apr 05, 2024 at 08:11:47PM +0800, Baolin Wang wrote:
> On 2024/3/21 02:02, Johannes Weiner wrote:
> > @@ -2127,15 +2165,14 @@ __rmqueue(struct zone *zone, unsigned int order, int migratetype,
> >  			return page;
> >  		}
> >  	}
> > -retry:
> > +
> >  	page = __rmqueue_smallest(zone, order, migratetype);
> >  	if (unlikely(!page)) {
> >  		if (alloc_flags & ALLOC_CMA)
> >  			page = __rmqueue_cma_fallback(zone, order);
> > -
> > -		if (!page && __rmqueue_fallback(zone, order, migratetype,
> > -						alloc_flags))
> > -			goto retry;
> > +		else
> > +			page = __rmqueue_fallback(zone, order, migratetype,
> > +						  alloc_flags);
>
> (Sorry for chiming in late.)
>
> I have some doubts about the changes here. The original logic was that
> if the allocation from the requested 'migratetype' failed, it would
> first try CMA page allocation and then attempt to fall back to the
> other migratetypes. Now it has been changed so that if the CMA
> allocation fails, it returns directly. This change caused a regression
> when I ran the thpcompact benchmark, resulting in a significant
> reduction in the percentage of THPs, as shown below:
>
> thpcompact Percentage Faults Huge
>                           K6.9-rc2    K6.9-rc2 + this patch
> Percentage huge-1     78.18 ( 0.00%)      42.49 ( -45.65%)
> Percentage huge-3     86.70 ( 0.00%)      35.13 ( -59.49%)
> Percentage huge-5     90.26 ( 0.00%)      52.35 ( -42.00%)
> Percentage huge-7     92.34 ( 0.00%)      31.84 ( -65.52%)
> Percentage huge-12    91.18 ( 0.00%)      45.85 ( -49.72%)
> Percentage huge-18    89.00 ( 0.00%)      29.18 ( -67.22%)
> Percentage huge-24    90.52 ( 0.00%)      46.68 ( -48.43%)
> Percentage huge-30    94.44 ( 0.00%)      38.35 ( -59.39%)
> Percentage huge-32    93.09 ( 0.00%)      39.37 ( -57.70%)

Ouch, sorry about that! I changed that specific part around later
during development and didn't retest with CMA. I'll be sure to
re-enable it in my config.
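
For anyone retesting this series: the CMA code paths are only exercised
when the kernel actually reserves a CMA area. A minimal .config sketch,
where the size is an arbitrary example value and the exact options vary
by platform:

CONFIG_CMA=y
# Optional: reserve a global CMA area for DMA so the CMA freelists are
# actually populated; 64 (MB) is just an example size. Booting with
# cma=64M on the kernel command line works too.
CONFIG_DMA_CMA=y
CONFIG_CMA_SIZE_MBYTES=64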

> After making the following modifications, the regression is gone.
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index ce67dc6777fa..a7cfe65e45c1 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2139,7 +2139,8 @@ __rmqueue(struct zone *zone, unsigned int order, int migratetype,
>  	if (unlikely(!page)) {
>  		if (alloc_flags & ALLOC_CMA)
>  			page = __rmqueue_cma_fallback(zone, order);
> -		else
> +
> +		if (!page)
>  			page = __rmqueue_fallback(zone, order, migratetype,
>  						  alloc_flags);
>  	}
>
> But I am not sure whether your original change is intentional? IIUC,
> we still need to try the fallback even if the CMA allocation fails;
> please correct me if I have misunderstood your code. Thanks.

No, this was accidental. I missed that CMA dependency when changing
things around for the new return type of __rmqueue_fallback(). Your
fix is good: just because the request qualifies for CMA doesn't mean
it will succeed from that region. We still need the regular fallback
in that case.
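
For the record, the resulting order of attempts in __rmqueue() with
your change applied reads like this; a condensed sketch of the hunk
above, not the verbatim upstream function:

	/*
	 * 1. Try the freelists of the requested migratetype.
	 * 2. If that fails and the request is CMA-eligible, try the
	 *    CMA pageblocks.
	 * 3. If page is still NULL, being eligible for CMA does not
	 *    guarantee free pages there, so steal from the other
	 *    migratetypes as a last resort.
	 */
	page = __rmqueue_smallest(zone, order, migratetype);
	if (unlikely(!page)) {
		if (alloc_flags & ALLOC_CMA)
			page = __rmqueue_cma_fallback(zone, order);

		if (!page)
			page = __rmqueue_fallback(zone, order, migratetype,
						  alloc_flags);
	}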

Andrew, could you please pick up Baolin's change for this patch?

[baolin.wang@xxxxxxxxxxxxxxxxx: fix allocation failures with CONFIG_CMA]

Thanks for debugging this and for the fix, Baolin.