Re: [PATCH 06/10] mm: page_alloc: fix freelist movement during block conversion

From: Baolin Wang
Date: Sun Apr 07 2024 - 02:58:23 EST

On 2024/4/6 00:56, Johannes Weiner wrote:
Hi Baolin,

On Fri, Apr 05, 2024 at 08:11:47PM +0800, Baolin Wang wrote:
On 2024/3/21 02:02, Johannes Weiner wrote:
@@ -2127,15 +2165,14 @@ __rmqueue(struct zone *zone, unsigned int order, int migratetype,
                                return page;
                }
        }
-retry:
+
        page = __rmqueue_smallest(zone, order, migratetype);
        if (unlikely(!page)) {
                if (alloc_flags & ALLOC_CMA)
                        page = __rmqueue_cma_fallback(zone, order);
-
-               if (!page && __rmqueue_fallback(zone, order, migratetype,
-                                               alloc_flags))
-                       goto retry;
+               else
+                       page = __rmqueue_fallback(zone, order, migratetype,
+                                                 alloc_flags);

(Sorry for chiming in late.)

I have some doubts about the changes here. The original logic was that
if the 'migratetype' allocation failed, it would first try CMA page
allocation and then attempt to fall back to other migratetypes. Now it
has been changed so that if the CMA allocation fails, it returns
directly. This change has caused a regression when I run the thpcompact
benchmark, resulting in a significant reduction in the percentage of
THPs, as shown below (a minimal model of the change follows the table):

thpcompact Percentage Faults Huge
                      K6.9-rc2            K6.9-rc2 + this patch
Percentage huge-1     78.18 ( 0.00%)      42.49 ( -45.65%)
Percentage huge-3     86.70 ( 0.00%)      35.13 ( -59.49%)
Percentage huge-5     90.26 ( 0.00%)      52.35 ( -42.00%)
Percentage huge-7     92.34 ( 0.00%)      31.84 ( -65.52%)
Percentage huge-12    91.18 ( 0.00%)      45.85 ( -49.72%)
Percentage huge-18    89.00 ( 0.00%)      29.18 ( -67.22%)
Percentage huge-24    90.52 ( 0.00%)      46.68 ( -48.43%)
Percentage huge-30    94.44 ( 0.00%)      38.35 ( -59.39%)
Percentage huge-32    93.09 ( 0.00%)      39.37 ( -57.70%)
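
To make the change concrete, here is a minimal user-space model of the
two control flows (a sketch only: the helper names are hypothetical
stand-ins for the kernel functions, which take zone/order/migratetype
arguments and return a struct page *). With a CMA-eligible request and
an exhausted CMA region, the posted patch returns NULL where the old
code would still have tried the fallback:

#include <stdio.h>
#include <stdbool.h>
#include <stddef.h>

static int fallback_page;

/* Hypothetical stand-ins; each may fail like its kernel counterpart.
 * Here the preferred freelists and the CMA region are both empty. */
static int *rmqueue_smallest(void)     { return NULL; }
static int *rmqueue_cma_fallback(void) { return NULL; }
static int *rmqueue_fallback(void)     { return &fallback_page; }

/* Patch 06/10 as posted: a CMA-eligible request never reaches the
 * regular fallback. */
static int *rmqueue_patched(bool alloc_cma)
{
        int *page = rmqueue_smallest();

        if (!page) {
                if (alloc_cma)
                        page = rmqueue_cma_fallback();
                else
                        page = rmqueue_fallback();
        }
        return page;
}

/* With the fix below: fall back whenever the page is still NULL. */
static int *rmqueue_fixed(bool alloc_cma)
{
        int *page = rmqueue_smallest();

        if (!page) {
                if (alloc_cma)
                        page = rmqueue_cma_fallback();
                if (!page)
                        page = rmqueue_fallback();
        }
        return page;
}

int main(void)
{
        printf("patched: %s\n", rmqueue_patched(true) ? "page" : "NULL");
        printf("fixed:   %s\n", rmqueue_fixed(true) ? "page" : "NULL");
        return 0;
}

This prints "patched: NULL" and "fixed: page" for the CMA-eligible
case, which is the lost fallback path described above.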

Ouch, sorry about that! I changed that specific part around later
during development and didn't retest with CMA. I'll be sure to
re-enable it in my config.

After making the following modifications, the regression is gone.

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index ce67dc6777fa..a7cfe65e45c1 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2139,7 +2139,8 @@ __rmqueue(struct zone *zone, unsigned int order, int migratetype,
        if (unlikely(!page)) {
                if (alloc_flags & ALLOC_CMA)
                        page = __rmqueue_cma_fallback(zone, order);
-               else
+
+               if (!page)
                        page = __rmqueue_fallback(zone, order, migratetype,
                                                  alloc_flags);
        }

But I am not sure whether your original change was intentional? IIUC,
we still need to try the fallback even if the CMA allocation fails;
please correct me if I misunderstand your code. Thanks.

No, this was accidental. I missed that CMA dependency when changing
things around for the new return type of __rmqueue_fallback(). Your
fix is good: just because the request qualifies for CMA doesn't mean
it will succeed from that region. We need the fallback for those.
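
For reference, folding the fix into the patch leaves the tail of
__rmqueue() reading as below (reconstructed from the two hunks quoted
above, so the surrounding context may differ slightly from the final
tree):

        page = __rmqueue_smallest(zone, order, migratetype);
        if (unlikely(!page)) {
                if (alloc_flags & ALLOC_CMA)
                        page = __rmqueue_cma_fallback(zone, order);

                /* CMA eligibility does not guarantee a CMA page, so
                 * still try the regular fallback when empty-handed. */
                if (!page)
                        page = __rmqueue_fallback(zone, order, migratetype,
                                                  alloc_flags);
        }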

OK. Thanks for the clarification.

Andrew, could you please pick up Baolin's change for this patch?

[baolin.wang@xxxxxxxxxxxxxxxxx: fix allocation failures with CONFIG_CMA]

Thanks for debugging this and the fix, Baolin.

My pleasure.