Re: [PATCH] mm: page_alloc: fix cma pageblock was stolen in rmqueue fallback

From: Vlastimil Babka
Date: Tue Sep 05 2023 - 12:46:20 EST


On 9/5/23 11:09, Mel Gorman wrote:
> On Wed, Aug 30, 2023 at 07:13:33PM +0800, Lecopzer Chen wrote:
>> commit 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a
>> spinlock") falls back to freeing the page via free_one_page() if the
>> pcp trylock fails. This allows a MIGRATE_CMA pageblock to be used as a
>> fallback and stolen whole by MIGRATE_UNMOVABLE during page allocation.
>>
>> The PCP free path is fine because free_pcppages_bulk() always looks up
>> the migratetype again before freeing the page, so this only happens when
>> someone tries to put a CMA page into another migratetype's freelist.
>>
>> Fixes: 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock")
>> Reported-by: Joe Liu <joe.liu@xxxxxxxxxxxx>
>> Signed-off-by: Lecopzer Chen <lecopzer.chen@xxxxxxxxxxxx>
>> Cc: Mark-pk Tsai <mark-pk.tsai@xxxxxxxxxxxx>
>> Cc: Joe Liu <joe.liu@xxxxxxxxxxxx>
>
> Sorry for the long delay and thanks Lecopzer for the patch.
>
> This changelog is difficult to parse, but the fix may also be too specific
> and could be more robust against types other than CMA. It is true that
> a failed PCP acquire may return a !is_migrate_isolate page to the wrong
> list, but it's more straightforward to unconditionally look up the PCP
> migratetype if the spinlock is not acquired.
>
> How about this? It unconditionally looks up the PCP migratetype after
> spinlock contention. It's build-tested only.
>
> --8<--
> mm: page_alloc: Free pages to correct buddy list after PCP lock contention
>
> Commit 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock")
> returns pages to the buddy list on PCP lock contention. However, for
> migratetypes that are not MIGRATE_PCPTYPES, the migratetype may have
> been clobbered already for pages that are not being isolated. In
> practice, this means that CMA pages may be returned to the wrong
> buddy list. While this might be harmless in some cases as it is
> MIGRATE_MOVABLE, the pageblock could be reassigned in rmqueue_fallback
> and prevent a future CMA allocation. Look up the PCP migratetype
> again unconditionally if the PCP lock is contended.
>
> [lecopzer.chen@xxxxxxxxxxxx: CMA-specific fix]
> Fixes: 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock")

I think we should Cc: stable for the sake of 6.1 LTS?

> Reported-by: Joe Liu <joe.liu@xxxxxxxxxxxx>
> Signed-off-by: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>

Acked-by: Vlastimil Babka <vbabka@xxxxxxx>

> ---
> mm/page_alloc.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 452459836b71..4053c377fee8 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2428,7 +2428,13 @@ void free_unref_page(struct page *page, unsigned int order)
>  		free_unref_page_commit(zone, pcp, page, migratetype, order);
>  		pcp_spin_unlock(pcp);
>  	} else {
> -		free_one_page(zone, page, pfn, order, migratetype, FPI_NONE);
> +		/*
> +		 * The page migratetype may have been clobbered for types
> +		 * (type >= MIGRATE_PCPTYPES && !is_migrate_isolate) so
> +		 * must be rechecked.
> +		 */
> +		free_one_page(zone, page, pfn, order,
> +			      get_pcppage_migratetype(page), FPI_NONE);
>  	}
>  	pcp_trylock_finish(UP_flags);
>  }
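
For anyone following along, here is a minimal user-space sketch of the
failure mode the changelog describes. It is not kernel code: the toy_*
names, the enum and the "recheck" flag are all made up for illustration,
and the enum ordering is only assumed to mirror the kernel's. It models a
free path that overwrites its local migratetype with MOVABLE in order to
pick a PCP list, and then compares reusing that local value on lock
contention with looking the cached type up again, as the patch does.

```c
#include <stdbool.h>

/* Toy migratetypes; ordering assumed to mirror the kernel's enum. */
enum toy_mt {
	TOY_UNMOVABLE,
	TOY_MOVABLE,
	TOY_RECLAIMABLE,
	TOY_PCPTYPES,			/* number of types that have PCP lists */
	TOY_HIGHATOMIC = TOY_PCPTYPES,
	TOY_CMA,
	TOY_ISOLATE,
};

/* Hypothetical stand-in for a page and its cached PCP migratetype. */
struct toy_page {
	enum toy_mt cached_mt;
};

/*
 * Returns the list type the page would end up on.
 * trylock_ok: whether the PCP lock was acquired.
 * recheck:    true models the patched path, false the old one.
 */
static enum toy_mt toy_free_unref_page(const struct toy_page *page,
				       bool trylock_ok, bool recheck)
{
	enum toy_mt mt = page->cached_mt;

	if (mt >= TOY_PCPTYPES) {
		if (mt == TOY_ISOLATE)
			return mt;	/* isolated pages bypass the PCP lists */
		mt = TOY_MOVABLE;	/* local copy clobbered to pick a PCP list */
	}

	if (trylock_ok)
		return mt;		/* page is parked on the per-CPU list */

	/* Lock contended: the page goes straight to a buddy list. */
	return recheck ? page->cached_mt : mt;
}
```

With a page whose cached type is TOY_CMA and a failed trylock, the old
path (recheck == false) reports TOY_MOVABLE while the rechecking path
reports TOY_CMA. Types below TOY_PCPTYPES come out the same either way,
which is why an unconditional lookup is harmless for them.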