Re: [PATCH] mm: page_alloc: fix cma pageblock was stolen in rmqueue fallback

From: Vlastimil Babka
Date: Mon Sep 11 2023 - 18:07:21 EST


On 9/11/23 20:11, Johannes Weiner wrote:
> On Mon, Sep 11, 2023 at 06:13:59PM +0200, Vlastimil Babka wrote:
>> On 9/11/23 17:57, Johannes Weiner wrote:
>> > On Tue, Sep 05, 2023 at 10:09:22AM +0100, Mel Gorman wrote:
>> >> mm: page_alloc: Free pages to correct buddy list after PCP lock contention
>> >>
>> >> Commit 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock")
>> >> returns pages to the buddy list on PCP lock contention. However, for
>> >> migratetypes that are not MIGRATE_PCPTYPES, the migratetype may have
>> >> been clobbered already for pages that are not being isolated. In
>> >> practice, this means that CMA pages may be returned to the wrong
>> >> buddy list. While this might be harmless in some cases as it is
>> >> MIGRATE_MOVABLE, the pageblock could be reassigned in rmqueue_fallback
>> >> and prevent a future CMA allocation. Lookup the PCP migratetype
>> >> against unconditionally if the PCP lock is contended.
>> >>
>> >> [lecopzer.chen@xxxxxxxxxxxx: CMA-specific fix]
>> >> Fixes: 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock")
>> >> Reported-by: Joe Liu <joe.liu@xxxxxxxxxxxx>
>> >> Signed-off-by: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
>> >> ---
>> >> mm/page_alloc.c | 8 +++++++-
>> >> 1 file changed, 7 insertions(+), 1 deletion(-)
>> >>
>> >> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> >> index 452459836b71..4053c377fee8 100644
>> >> --- a/mm/page_alloc.c
>> >> +++ b/mm/page_alloc.c
>> >> @@ -2428,7 +2428,13 @@ void free_unref_page(struct page *page, unsigned int order)
>> >> free_unref_page_commit(zone, pcp, page, migratetype, order);
>> >> pcp_spin_unlock(pcp);
>> >> } else {
>> >> - free_one_page(zone, page, pfn, order, migratetype, FPI_NONE);
>> >> + /*
>> >> + * The page migratetype may have been clobbered for types
>> >> + * (type >= MIGRATE_PCPTYPES && !is_migrate_isolate) so
>> >> + * must be rechecked.
>> >> + */
>> >> + free_one_page(zone, page, pfn, order,
>> >> + get_pcppage_migratetype(page), FPI_NONE);
>> >> }
>> >> pcp_trylock_finish(UP_flags);
>> >> }
>> >>
>> >
>> > I had sent a (similar) fix for this here:
>> >
>> > https://lore.kernel.org/lkml/20230821183733.106619-4-hannes@xxxxxxxxxxx/
>> >
>> > The context wasn't CMA, but HIGHATOMIC pages going to the movable
>> > freelist. But the class of bug is the same: the migratetype tweaking
>> > really only applies to the pcplist, not the buddy slowpath; I added a
>> > local pcpmigratetype to make it more clear, and hopefully prevent bugs
>> > of this nature down the line.
>>
>> Seems to be the cleanest solution to me, indeed.
>>
>> > I'm just preparing v2 of the above series. Do you want me to break
>> > this change out and send it separately?
>>
>> Works for me, if you combine the it with the information about what commit
>> that fixes, the CMA implications reported, and Cc stable.
>
> How about this? Based on v6.6-rc1.
>
> ---
>
> From 84e4490095ed3d1f2991e7f0e58e2968e56cc7c0 Mon Sep 17 00:00:00 2001
> From: Johannes Weiner <hannes@xxxxxxxxxxx>
> Date: Fri, 28 Jul 2023 14:29:41 -0400
> Subject: [PATCH] mm: page_alloc: fix CMA and HIGHATOMIC landing on the wrong
> buddy list
>
> Commit 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a
> spinlock") bypasses the pcplist on lock contention and returns the
> page directly to the buddy list of the page's migratetype.
>
> For pages that don't have their own pcplist, such as CMA and
> HIGHATOMIC, the migratetype is temporarily updated such that the page
> can hitch a ride on the MOVABLE pcplist. Their true type is later
> reassessed when flushing in free_pcppages_bulk(). However, when lock
> contention is detected after the type was already overriden, the
> bypass will then put the page on the wrong buddy list.
>
> Once on the MOVABLE buddy list, the page becomes eligible for
> fallbacks and even stealing. In the case of HIGHATOMIC, otherwise
> ineligible allocations can dip into the highatomic reserves. In the
> case of CMA, the page can be lost from the CMA region permanently.
>
> Use a separate pcpmigratetype variable for the pcplist override. Use
> the original migratetype when going directly to the buddy. This fixes
> the bug and should make the intentions more obvious in the code.
>
> Originally sent here to address the HIGHATOMIC case:
> https://lore.kernel.org/lkml/20230821183733.106619-4-hannes@xxxxxxxxxxx/
>
> Changelog updated in response to the CMA-specific bug report.
>
> [mgorman@xxxxxxxxxxxxxxxxxxx: updated changelog]
> Reported-by: Joe Liu <joe.liu@xxxxxxxxxxxx>
> Fixes: 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock")
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: Johannes Weiner <hannes@xxxxxxxxxxx>

Reviewed-by: Vlastimil Babka <vbabka@xxxxxxx>

> ---
> mm/page_alloc.c | 12 ++++++------
> 1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 0c5be12f9336..95546f376302 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2400,7 +2400,7 @@ void free_unref_page(struct page *page, unsigned int order)
> struct per_cpu_pages *pcp;
> struct zone *zone;
> unsigned long pfn = page_to_pfn(page);
> - int migratetype;
> + int migratetype, pcpmigratetype;
>
> if (!free_unref_page_prepare(page, pfn, order))
> return;
> @@ -2408,24 +2408,24 @@ void free_unref_page(struct page *page, unsigned int order)
> /*
> * We only track unmovable, reclaimable and movable on pcp lists.
> * Place ISOLATE pages on the isolated list because they are being
> - * offlined but treat HIGHATOMIC as movable pages so we can get those
> - * areas back if necessary. Otherwise, we may have to free
> + * offlined but treat HIGHATOMIC and CMA as movable pages so we can
> + * get those areas back if necessary. Otherwise, we may have to free
> * excessively into the page allocator
> */
> - migratetype = get_pcppage_migratetype(page);
> + migratetype = pcpmigratetype = get_pcppage_migratetype(page);
> if (unlikely(migratetype >= MIGRATE_PCPTYPES)) {
> if (unlikely(is_migrate_isolate(migratetype))) {
> free_one_page(page_zone(page), page, pfn, order, migratetype, FPI_NONE);
> return;
> }
> - migratetype = MIGRATE_MOVABLE;
> + pcpmigratetype = MIGRATE_MOVABLE;
> }
>
> zone = page_zone(page);
> pcp_trylock_prepare(UP_flags);
> pcp = pcp_spin_trylock(zone->per_cpu_pageset);
> if (pcp) {
> - free_unref_page_commit(zone, pcp, page, migratetype, order);
> + free_unref_page_commit(zone, pcp, page, pcpmigratetype, order);
> pcp_spin_unlock(pcp);
> } else {
> free_one_page(zone, page, pfn, order, migratetype, FPI_NONE);