Re: [PATCH] mm/compaction: respect cpusets when checking retry suitability
From: Vlastimil Babka (SUSE)
Date: Thu May 28 2026 - 14:24:30 EST
On 5/26/26 14:22, fujunjie wrote:
> should_compact_retry() handles COMPACT_SKIPPED by asking
> compaction_zonelist_suitable() whether reclaim can make a later
> compaction attempt worthwhile. That answer is used for the current
> allocation, so it should follow the same zone eligibility rules as the
> allocation itself.
>
> When cpusets are enabled, allocator slowpath decisions are marked with
> ALLOC_CPUSET. The allocation path, direct compaction and reclaim retry
> all skip zones rejected by __cpuset_zone_allowed().
>
> compaction_zonelist_suitable() does not apply that filter. It only walks
> ac->zonelist/ac->nodemask, so it can return true because a zone that is
> not usable for the current allocation would pass __compaction_suitable().
>
> That does not let the allocation use the disallowed zone. Later
> allocation and direct compaction paths still apply cpuset filtering.
> However, it can make should_compact_retry() retry based on memory that
> this allocation cannot use.
>
> Pass gfp_mask down and apply the same ALLOC_CPUSET check in
> compaction_zonelist_suitable(). This keeps the retry decision aligned
> with the zones that the allocation is allowed to use.

Nice find.

> A temporary debugfs probe was also used to call the old and new
> compaction_zonelist_suitable() predicates in the same two-node NUMA guest.
> The task was restricted to mems=0 while ac->nodemask covered nodes 0-1.
> After putting pressure on node0, node0 failed __compaction_suitable() for
> order-10 and node1 passed it, but node1 was rejected by
> __cpuset_zone_allowed(). In that state the old predicate returned true and
> the patched predicate returned false.

Nice that you verified it like this.

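For readers following along, the probe result above can be reproduced with a small userspace model. This is only a sketch under stated assumptions: struct model_zone, model_cpuset_allowed(), model_suitable_old() and model_suitable_new() are hypothetical stand-ins invented for illustration, not kernel code. A per-zone boolean stands in for the __compaction_suitable() result and a plain bitmask stands in for the task's mems. The cpusets_enabled() static-key check is left out of the model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one zone visited in ac->zonelist. */
struct model_zone {
	int nid;		/* NUMA node the zone belongs to */
	bool compact_ok;	/* models __compaction_suitable() passing */
};

/* Models __cpuset_zone_allowed(): is the zone's node in the task's mems? */
static bool model_cpuset_allowed(const struct model_zone *z,
				 unsigned long mems_allowed)
{
	return (mems_allowed & (1UL << z->nid)) != 0;
}

/* Old behaviour: any zone that passes the suitability check counts. */
static bool model_suitable_old(const struct model_zone *zones, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (zones[i].compact_ok)
			return true;
	return false;
}

/* Patched behaviour: skip zones this allocation is not allowed to use. */
static bool model_suitable_new(const struct model_zone *zones, size_t n,
			       bool alloc_cpuset, unsigned long mems_allowed)
{
	for (size_t i = 0; i < n; i++) {
		if (alloc_cpuset &&
		    !model_cpuset_allowed(&zones[i], mems_allowed))
			continue;
		if (zones[i].compact_ok)
			return true;
	}
	return false;
}

/* Two-node case from the probe: node0 fails, node1 passes. */
static const struct model_zone probe_zones[] = {
	{ .nid = 0, .compact_ok = false },
	{ .nid = 1, .compact_ok = true },
};

/* Self-check for the mems=0 case (bit 0 set, bit 1 clear). */
static void model_self_check(void)
{
	assert(model_suitable_old(probe_zones, 2));
	assert(!model_suitable_new(probe_zones, 2, true, 0x1UL));
}
```

With mems covering only node 0, the old model returns true because of node 1, while the patched model returns false, which matches the probe output described above. Without ALLOC_CPUSET (alloc_cpuset == false in the model) both variants agree.
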
> Signed-off-by: fujunjie <fujunjie1@xxxxxx>
Reviewed-by: Vlastimil Babka (SUSE) <vbabka@xxxxxxxxxx>

Also, probably add this tag? That commit introduced the other cpuset checks
for direct compaction.

Fixes: 435b3894e742 ("mm:page_alloc: fix the NULL ac->nodemask in __alloc_pages_slowpath()")
(not stable material though)
Thanks!
> ---
> include/linux/compaction.h | 2 +-
> mm/compaction.c | 6 +++++-
> mm/page_alloc.c | 15 +++++++++------
> 3 files changed, 15 insertions(+), 8 deletions(-)
>
> diff --git a/include/linux/compaction.h b/include/linux/compaction.h
> index 173d9c07a895..c829c48d1c71 100644
> --- a/include/linux/compaction.h
> +++ b/include/linux/compaction.h
> @@ -101,7 +101,7 @@ extern void compaction_defer_reset(struct zone *zone, int order,
> bool alloc_success);
>
> bool compaction_zonelist_suitable(struct alloc_context *ac, int order,
> - int alloc_flags);
> + int alloc_flags, gfp_t gfp_mask);
>
> extern void __meminit kcompactd_run(int nid);
> extern void __meminit kcompactd_stop(int nid);
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 3648ce22c807..2295b2487dfc 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -2447,7 +2447,7 @@ bool compaction_suitable(struct zone *zone, int order, unsigned long watermark,
>
> /* Used by direct reclaimers */
> bool compaction_zonelist_suitable(struct alloc_context *ac, int order,
> - int alloc_flags)
> + int alloc_flags, gfp_t gfp_mask)
> {
> struct zone *zone;
> struct zoneref *z;
> @@ -2460,6 +2460,10 @@ bool compaction_zonelist_suitable(struct alloc_context *ac, int order,
> ac->highest_zoneidx, ac->nodemask) {
> unsigned long available;
>
> + if (cpusets_enabled() && (alloc_flags & ALLOC_CPUSET) &&
> + !__cpuset_zone_allowed(zone, gfp_mask))
> + continue;
> +
> /*
> * Do not consider all the reclaimable memory because we do not
> * want to trash just for a single high order allocation which
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index ddeb79fa12db..93d56ba339fb 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -4199,7 +4199,8 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
> }
>
> static inline bool
> -should_compact_retry(struct alloc_context *ac, int order, int alloc_flags,
> +should_compact_retry(gfp_t gfp_mask, struct alloc_context *ac, int order,
> + int alloc_flags,
> enum compact_result compact_result,
> enum compact_priority *compact_priority,
> int *compaction_retries)
> @@ -4221,7 +4222,8 @@ should_compact_retry(struct alloc_context *ac, int order, int alloc_flags,
> * migration targets. Continue if reclaim can help.
> */
> if (compact_result == COMPACT_SKIPPED) {
> - ret = compaction_zonelist_suitable(ac, order, alloc_flags);
> + ret = compaction_zonelist_suitable(ac, order, alloc_flags,
> + gfp_mask);
> goto out;
> }
>
> @@ -4274,7 +4276,8 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
> }
>
> static inline bool
> -should_compact_retry(struct alloc_context *ac, int order, int alloc_flags,
> +should_compact_retry(gfp_t gfp_mask, struct alloc_context *ac, int order,
> + int alloc_flags,
> enum compact_result compact_result,
> enum compact_priority *compact_priority,
> int *compaction_retries)
> @@ -4892,9 +4895,9 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
> * of free memory (see __compaction_suitable)
> */
> if (did_some_progress > 0 && can_compact &&
> - should_compact_retry(ac, order, alloc_flags,
> - compact_result, &compact_priority,
> - &compaction_retries))
> + should_compact_retry(gfp_mask, ac, order, alloc_flags,
> + compact_result, &compact_priority,
> + &compaction_retries))
> goto retry;
>
> /* Reclaim/compaction failed to prevent the fallback */
>
> base-commit: e8c2f9fdadee7cbc75134dc463c1e0d856d6e5c7