Re: [PATCH v2] mm/filemap: avoid costly reclaim for high-order folio allocations

From: Andrew Morton

Date: Mon Apr 20 2026 - 14:04:21 EST

On Mon, 20 Apr 2026 16:14:03 +0000 Salvatore Dipietro <dipiets@xxxxxxxxx> wrote:

> Commit 5d8edfb900d5 ("iomap: Copy larger chunks from userspace")
> introduced high-order folio allocations in the buffered write path.
> When memory is fragmented, each failed allocation above
> PAGE_ALLOC_COSTLY_ORDER triggers compaction and drain_all_pages() via
> __alloc_pages_slowpath(), causing a 0.75x throughput drop on pgbench
> (simple-update) with 1024 clients on a 96-vCPU arm64 system.
>
> In __filemap_get_folio(), for orders above min_order, split the
> allocation behavior by cost:
>
> - For orders above PAGE_ALLOC_COSTLY_ORDER: strip
> __GFP_DIRECT_RECLAIM, making them purely opportunistic. The
> allocator tries the freelists only and returns NULL immediately if
> pages are not available.
>
> - For non-costly orders (between min_order and
> PAGE_ALLOC_COSTLY_ORDER): use __GFP_NORETRY to allow lightweight
> direct reclaim without expensive compaction retries.
>
> With this patch, pgbench throughput recovers to 148k TPS (+67% vs
> regressed baseline), stable across all iterations.

"Good money after bad"? Prove me wrong!

Instead of performing weird, fragile, hard-to-maintain party tricks with
the page allocator to work around the damage, plan B is to simply
revert 5d8edfb900d5.

5d8edfb900d5 came with no performance testing results. Does anyone
have any evidence that it improved anything? By how much?

> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -2007,8 +2007,13 @@ struct folio *__filemap_get_folio_mpol(struct address_space *mapping,
>  			gfp_t alloc_gfp = gfp;
>
>  			err = -ENOMEM;
> -			if (order > min_order)
> -				alloc_gfp |= __GFP_NORETRY | __GFP_NOWARN;
> +			if (order > min_order) {
> +				alloc_gfp |= __GFP_NOWARN;
> +				if (order > PAGE_ALLOC_COSTLY_ORDER)
> +					alloc_gfp &= ~__GFP_DIRECT_RECLAIM;
> +				else
> +					alloc_gfp |= __GFP_NORETRY;
> +			}
>  			folio = filemap_alloc_folio(alloc_gfp, order, policy);

I don't think it's reasonable to expect a reader to understand why this
code is as it is. Hence each clause here should have a comment
explaining why we're taking that step, please.
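
As a hedged illustration of the kind of comments being asked for, the
flag selection from the quoted hunk can be sketched as a standalone
userspace function. The helper name folio_alloc_gfp() and the flag bit
values below are made up for this sketch and are not the kernel's
definitions. The comment wording is one reading of the patch
description, not the submitter's text.

```c
typedef unsigned int gfp_t;

/* Placeholder bit values for this sketch only; NOT the kernel's. */
#define __GFP_DIRECT_RECLAIM	(1u << 0)
#define __GFP_NORETRY		(1u << 1)
#define __GFP_NOWARN		(1u << 2)

/* Matches the kernel's value: orders above 3 are "costly". */
#define PAGE_ALLOC_COSTLY_ORDER	3

/*
 * Hypothetical helper: compute the gfp mask for one allocation attempt
 * at @order when the mapping requires at least @min_order.
 */
static gfp_t folio_alloc_gfp(gfp_t gfp, unsigned int order,
			     unsigned int min_order)
{
	gfp_t alloc_gfp = gfp;

	if (order > min_order) {
		/*
		 * Anything above min_order is optional and the caller
		 * can retry at a lower order, so a failure here is not
		 * worth a warning.
		 */
		alloc_gfp |= __GFP_NOWARN;
		if (order > PAGE_ALLOC_COSTLY_ORDER)
			/*
			 * Costly orders: per the patch description, a
			 * failed attempt ends up in compaction and
			 * drain_all_pages(). Forbid direct reclaim so
			 * the attempt only looks at the freelists.
			 */
			alloc_gfp &= ~__GFP_DIRECT_RECLAIM;
		else
			/*
			 * Non-costly orders: allow lightweight direct
			 * reclaim, but give up rather than retrying.
			 */
			alloc_gfp |= __GFP_NORETRY;
	}
	return alloc_gfp;
}
```

With these placeholder values, folio_alloc_gfp(__GFP_DIRECT_RECLAIM, 4, 0)
returns only __GFP_NOWARN, while an order-2 request keeps
__GFP_DIRECT_RECLAIM and gains __GFP_NORETRY | __GFP_NOWARN. A request
at exactly min_order is returned unmodified.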

Look. I'm being grumpy. We know that patches which purportedly
improve performance must come with quality performance testing results.
How long have we been at this?