Re: [PATCH] mm: page_alloc: fix cma pageblock was stolen in rmqueue fallback

From: Vlastimil Babka
Date: Mon Sep 11 2023 - 18:00:04 EST


On 9/11/23 17:57, Johannes Weiner wrote:
> On Tue, Sep 05, 2023 at 10:09:22AM +0100, Mel Gorman wrote:
>> mm: page_alloc: Free pages to correct buddy list after PCP lock contention
>>
>> Commit 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock")
>> returns pages to the buddy list on PCP lock contention. However, for
>> migratetypes that are not MIGRATE_PCPTYPES, the migratetype may have
>> been clobbered already for pages that are not being isolated. In
>> practice, this means that CMA pages may be returned to the wrong
>> buddy list. While this might be harmless in some cases as it is
>> MIGRATE_MOVABLE, the pageblock could be reassigned in rmqueue_fallback
>> and prevent a future CMA allocation. Look up the PCP migratetype
>> again unconditionally if the PCP lock is contended.
>>
>> [lecopzer.chen@xxxxxxxxxxxx: CMA-specific fix]
>> Fixes: 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock")
>> Reported-by: Joe Liu <joe.liu@xxxxxxxxxxxx>
>> Signed-off-by: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
>> ---
>> mm/page_alloc.c | 8 +++++++-
>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 452459836b71..4053c377fee8 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -2428,7 +2428,13 @@ void free_unref_page(struct page *page, unsigned int order)
>>  		free_unref_page_commit(zone, pcp, page, migratetype, order);
>>  		pcp_spin_unlock(pcp);
>>  	} else {
>> -		free_one_page(zone, page, pfn, order, migratetype, FPI_NONE);
>> +		/*
>> +		 * The page migratetype may have been clobbered for types
>> +		 * (type >= MIGRATE_PCPTYPES && !is_migrate_isolate) so
>> +		 * must be rechecked.
>> +		 */
>> +		free_one_page(zone, page, pfn, order,
>> +			      get_pcppage_migratetype(page), FPI_NONE);
>>  	}
>>  	pcp_trylock_finish(UP_flags);
>>  }
>>
>
> I had sent a (similar) fix for this here:
>
> https://lore.kernel.org/lkml/20230821183733.106619-4-hannes@xxxxxxxxxxx/
>
> The context wasn't CMA, but HIGHATOMIC pages going to the movable
> freelist. But the class of bug is the same: the migratetype tweaking
> really only applies to the pcplist, not the buddy slowpath; I added a
> local pcpmigratetype to make it more clear, and hopefully prevent bugs
> of this nature down the line.

Seems to be the cleanest solution to me, indeed.

> I'm just preparing v2 of the above series. Do you want me to break
> this change out and send it separately?

Works for me, if you combine it with the information about which commit it
fixes, the reported CMA implications, and Cc stable.
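
For anyone following along, the class of bug can be illustrated with a
minimal userspace sketch. This is not kernel code and not the actual patch:
struct free_result, model_free_buggy() and model_free_fixed() are names made
up for illustration, and the lock is reduced to a boolean. Only the
migratetype names and the idea of a separate local pcpmigratetype come from
the thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed to mirror the kernel's enum ordering with CMA enabled. */
enum migratetype {
	MIGRATE_UNMOVABLE,
	MIGRATE_MOVABLE,
	MIGRATE_RECLAIMABLE,
	MIGRATE_PCPTYPES,	/* number of types that have pcp lists */
	MIGRATE_HIGHATOMIC = MIGRATE_PCPTYPES,
	MIGRATE_CMA,
	MIGRATE_ISOLATE,
	MIGRATE_TYPES
};

/* Hypothetical result type: which free list the page was put on. */
struct free_result {
	bool on_pcplist;
	enum migratetype list;
};

/* Models the pre-fix flow: one variable serves both paths. */
static struct free_result model_free_buggy(enum migratetype block_mt,
					   bool got_pcp_lock)
{
	enum migratetype migratetype = block_mt;

	assert(block_mt < MIGRATE_TYPES);
	if (migratetype >= MIGRATE_PCPTYPES) {
		/* Isolated pages always go straight to the buddy lists. */
		if (migratetype == MIGRATE_ISOLATE)
			return (struct free_result){ false, migratetype };
		migratetype = MIGRATE_MOVABLE;	/* meant for the pcplist only */
	}
	/* On contention the clobbered value leaks into the buddy path. */
	return (struct free_result){ got_pcp_lock, migratetype };
}

/* Models the fix: the override lives in its own local. */
static struct free_result model_free_fixed(enum migratetype block_mt,
					   bool got_pcp_lock)
{
	enum migratetype migratetype = block_mt;
	enum migratetype pcpmigratetype = block_mt;

	assert(block_mt < MIGRATE_TYPES);
	if (migratetype >= MIGRATE_PCPTYPES) {
		if (migratetype == MIGRATE_ISOLATE)
			return (struct free_result){ false, migratetype };
		pcpmigratetype = MIGRATE_MOVABLE;
	}
	if (got_pcp_lock)
		return (struct free_result){ true, pcpmigratetype };
	/* Buddy slowpath keeps the original migratetype. */
	return (struct free_result){ false, migratetype };
}
```

In this model, model_free_buggy(MIGRATE_CMA, false) reports the
MIGRATE_MOVABLE list, while model_free_fixed(MIGRATE_CMA, false) reports
MIGRATE_CMA. The same holds for MIGRATE_HIGHATOMIC, which is the case the
earlier series ran into.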