Re: [PATCH 06/10] mm: page_alloc: fix freelist movement during block conversion

From: Vlastimil Babka
Date: Tue Apr 09 2024 - 02:22:14 EST


On 4/5/24 6:56 PM, Johannes Weiner wrote:
> Hi Baolin,
>
> On Fri, Apr 05, 2024 at 08:11:47PM +0800, Baolin Wang wrote:
>> On 2024/3/21 02:02, Johannes Weiner wrote:
>> > @@ -2127,15 +2165,14 @@ __rmqueue(struct zone *zone, unsigned int order, int migratetype,
>> >                                  return page;
>> >                  }
>> >          }
>> > -retry:
>> > +
>> >          page = __rmqueue_smallest(zone, order, migratetype);
>> >          if (unlikely(!page)) {
>> >                  if (alloc_flags & ALLOC_CMA)
>> >                          page = __rmqueue_cma_fallback(zone, order);
>> > -
>> > -                if (!page && __rmqueue_fallback(zone, order, migratetype,
>> > -                                                alloc_flags))
>> > -                        goto retry;
>> > +                else
>> > +                        page = __rmqueue_fallback(zone, order, migratetype,
>> > +                                                  alloc_flags);
>>
>> (Sorry for chiming in late.)
>>
>> I have some doubts about the change here. The original logic was: if the
>> allocation for 'migratetype' fails, first try a CMA allocation, and then
>> fall back to the other migratetypes. Now, if the CMA allocation fails, we
>> return directly without trying the fallback. This causes a regression when
>> I run the thpcompact benchmark, with a significant reduction in the
>> percentage of THPs, as shown below:
>>
>> thpcompact Percentage Faults Huge
>>                       K6.9-rc2          K6.9-rc2 + this patch
>> Percentage huge-1     78.18 (  0.00%)   42.49 ( -45.65%)
>> Percentage huge-3     86.70 (  0.00%)   35.13 ( -59.49%)
>> Percentage huge-5     90.26 (  0.00%)   52.35 ( -42.00%)
>> Percentage huge-7     92.34 (  0.00%)   31.84 ( -65.52%)
>> Percentage huge-12    91.18 (  0.00%)   45.85 ( -49.72%)
>> Percentage huge-18    89.00 (  0.00%)   29.18 ( -67.22%)
>> Percentage huge-24    90.52 (  0.00%)   46.68 ( -48.43%)
>> Percentage huge-30    94.44 (  0.00%)   38.35 ( -59.39%)
>> Percentage huge-32    93.09 (  0.00%)   39.37 ( -57.70%)
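
To make the behavioural difference explicit, here is a minimal userspace
sketch of the two flows (the stub names only echo __rmqueue_smallest(),
__rmqueue_cma_fallback() and __rmqueue_fallback(); they are placeholders,
not the kernel code). With an ALLOC_CMA-eligible request whose CMA attempt
comes back empty, the flow as posted gives up, while the old flow would
still have tried the fallback:

/*
 * Self-contained sketch only: the stubs model the case "requested
 * migratetype's freelists empty, CMA exhausted, fallback would succeed".
 */
#include <stdbool.h>
#include <stdio.h>

static void *rmqueue_smallest(void)     { return NULL; }      /* desired freelist empty */
static void *rmqueue_cma_fallback(void) { return NULL; }      /* CMA region exhausted */
static void *rmqueue_fallback(void)     { return (void *)1; } /* block stealing would work */

/* Flow as posted in this patch: a failed CMA attempt ends the allocation. */
static void *rmqueue_patched(bool alloc_cma)
{
	void *page = rmqueue_smallest();

	if (!page) {
		if (alloc_cma)
			page = rmqueue_cma_fallback();
		else
			page = rmqueue_fallback();
	}
	return page;
}

/* Pre-patch flow: a failed CMA attempt still falls back to other migratetypes. */
static void *rmqueue_original(bool alloc_cma)
{
	void *page = rmqueue_smallest();

	if (!page) {
		if (alloc_cma)
			page = rmqueue_cma_fallback();
		if (!page)
			page = rmqueue_fallback();
	}
	return page;
}

int main(void)
{
	printf("patched : %p\n", rmqueue_patched(true));  /* NULL: allocation fails */
	printf("original: %p\n", rmqueue_original(true)); /* non-NULL: fallback succeeds */
	return 0;
}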
>
> Ouch, sorry about that! I changed that specific part around later
> during development and didn't retest with CMA. I'll be sure to
> re-enable CMA in my config.
>
>> After making the following modifications, the regression is gone.
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index ce67dc6777fa..a7cfe65e45c1 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -2139,7 +2139,8 @@ __rmqueue(struct zone *zone, unsigned int order, int migratetype,
>>          if (unlikely(!page)) {
>>                  if (alloc_flags & ALLOC_CMA)
>>                          page = __rmqueue_cma_fallback(zone, order);
>> -                else
>> +
>> +                if (!page)
>>                          page = __rmqueue_fallback(zone, order, migratetype,
>>                                                    alloc_flags);
>>          }
>>
>>
>> But I am not sure whether your original change was intentional? IIUC, we
>> still need to try the fallback even if the CMA allocation fails; please
>> correct me if I misunderstand your code. Thanks.
>
> No, this was accidental. I missed that CMA dependency when changing
> things around for the new return type of __rmqueue_fallback(). Your
> fix is good: just because the request qualifies for CMA doesn't mean
> it will succeed from that region. We need the fallback for those.
>
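
FWIW, putting the hunk from the patch together with Baolin's fix, the tail
of __rmqueue() should end up as below (reconstructed from the two diffs
above, not compile-tested):

        page = __rmqueue_smallest(zone, order, migratetype);
        if (unlikely(!page)) {
                if (alloc_flags & ALLOC_CMA)
                        page = __rmqueue_cma_fallback(zone, order);

                if (!page)
                        page = __rmqueue_fallback(zone, order, migratetype,
                                                  alloc_flags);
        }

That keeps the extra CMA attempt for ALLOC_CMA requests, but a miss there
no longer short-circuits the normal migratetype fallback.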
> Andrew, could you please pick up Baolin's change for this patch?
>
> [baolin.wang@xxxxxxxxxxxxxxxxx: fix allocation failures with CONFIG_CMA]

Acked-by: Vlastimil Babka <vbabka@xxxxxxx>

> Thanks for debugging this and for the fix, Baolin.