Re: [PATCH 1/1] iomap: avoid compaction for costly folio order allocation
From: Salvatore Dipietro
Date: Wed May 06 2026 - 08:34:19 EST
On 5/03/26 05:52, Ritesh Harjani wrote:
> Also as per the documentation [1], huge_pages=try option is the default
> setting. So I am assuming in production we at least won't suffer from
> this memory fragmentation, correct?
Yes, huge_pages=try is the default option, but PostgreSQL only uses huge
pages if enough of them to cover the entire shared_buffers size have been
pre-allocated via "vm.nr_hugepages", which is not done automatically.
Without that, the system effectively falls into the huge_pages=off category.
Even with a partial pre-allocation, PostgreSQL will not be able to use
huge pages.
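
To make the pre-allocation requirement concrete, here is a minimal sketch of
the arithmetic behind a "vm.nr_hugepages" value. The helper name
hugepages_needed() is hypothetical, and the input is assumed to be the full
shared memory request, which in practice is somewhat larger than
shared_buffers alone (recent PostgreSQL releases report it as
shared_memory_size_in_huge_pages).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only: number of huge pages
 * required to back `bytes` of shared memory, rounding up so that a
 * partially filled last page still counts as a whole page.
 */
uint64_t hugepages_needed(uint64_t bytes, uint64_t hugepage_bytes)
{
	assert(hugepage_bytes > 0);

	/* Divide first, then add the remainder page, to avoid overflow. */
	return bytes / hugepage_bytes + (bytes % hugepage_bytes != 0);
}
```

For example, an 8 GiB request with 2 MiB huge pages needs 4096 pages; if
fewer are reserved, the request cannot be fully backed and the case described
above applies.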
On 5/03/26 11:55, Matthew Wilcox wrote:
> or we need more understandable GFP flags. Or the page allocator could
> use the __GFP_NORETRY flag to say "oh well, this allocation has a fallback,
> I'll kick kcompactd to try to compact some more memory, but I'll fail
> the allocation".
We also tested waking kcompactd in the background when __GFP_NORETRY is
passed and then jumping to "nopage", so that the folio allocation request
does not block. Here is the patch, tested the same way as the others, with
the PREEMPT_NONE patch [1] applied:
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 65e205111553..d4f322910992 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4818,6 +4818,26 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	if (current->flags & PF_MEMALLOC)
 		goto nopage;
 
+	/*
+	 * Costly allocations with __GFP_NORETRY are opportunistic - Don't
+	 * stall on direct compaction or reclaim; instead, kick
+	 * kcompactd on the preferred node so large pages may become
+	 * available for future allocations and let the caller fall back now.
+	 *
+	 * Direct compaction is way too costly for hot allocation paths on
+	 * large systems: each attempt calls drain_all_pages() which IPIs
+	 * every CPU. Only wake kcompactd on the local node to avoid
+	 * cross-NUMA interference with unrelated workloads.
+	 */
+	if (costly_order && (gfp_mask & __GFP_NORETRY)) {
+		struct zone *preferred_zone = ac->preferred_zoneref->zone;
+
+		if (preferred_zone)
+			wakeup_kcompactd(preferred_zone->zone_pgdat, order,
+					 ac->highest_zoneidx);
+		goto nopage;
+	}
+
 	/* Try direct reclaim and then allocating */
 	if (!compact_first) {
 		page = __alloc_pages_direct_reclaim(gfp_mask, order, alloc_flags,
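
For readers who want the new branch in isolation, the decision it adds can be
modelled in userspace roughly as follows. This is only a sketch: the enum, the
function name, and the MODEL_GFP_NORETRY bit are hypothetical stand-ins (not
the real __GFP_NORETRY value), and the threshold of 3 is assumed to mirror
PAGE_ALLOC_COSTLY_ORDER.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_COSTLY_ORDER 3     /* assumed to mirror PAGE_ALLOC_COSTLY_ORDER */
#define MODEL_GFP_NORETRY  0x1u  /* placeholder bit, not the kernel's value */

enum slowpath_action {
	ACT_DIRECT_RECLAIM_COMPACT,   /* existing behaviour: may stall */
	ACT_WAKE_KCOMPACTD_AND_FAIL,  /* new branch: wake kcompactd, goto nopage */
};

/* Model of the check added to the slowpath by the patch above. */
enum slowpath_action model_slowpath(unsigned int order, unsigned int gfp)
{
	bool costly_order = order > MODEL_COSTLY_ORDER;

	assert(order < 32);

	if (costly_order && (gfp & MODEL_GFP_NORETRY))
		return ACT_WAKE_KCOMPACTD_AND_FAIL;

	return ACT_DIRECT_RECLAIM_COMPACT;
}
```

Under these assumptions, only requests that are both costly (order above 3)
and marked as having a fallback skip direct reclaim and compaction; everything
else keeps the existing behaviour.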
Here are the results we collected; the patch above is the "kcompactd
background" row:
| Patch | Run 1 | Run 2 | Run 3 | Average | % vs Baseline |
|----------------------|-----------:|-----------:|-----------:|------------:|:-------------:|
| Baseline | 107,064.61 | 97,043.86 | 101,830.78 | 101,979.75 | — |
| Proposed patch | 146,012.23 | 136,392.36 | 141,178.00 | 141,194.20 | +38.45% |
| Ritesh's suggestion | 147,481.50 | 133,069.03 | 137,051.30 | 139,200.61 | +36.50% |
| Matthew's suggestion | 145,653.91 | 144,169.24 | 141,768.31 | 143,863.82 | +41.07% |
| kcompactd background | 146,760.75 | 128,094.92 | 127,979.74 | 134,278.47 | +31.67% |
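
The "Average" and "% vs Baseline" columns follow from the three runs by plain
arithmetic. A minimal sketch of that calculation, with hypothetical helper
names, in case anyone wants to recheck the numbers:

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of n samples (n must be non-zero). */
double mean(const double *v, size_t n)
{
	double sum = 0.0;
	size_t i;

	assert(n > 0);
	for (i = 0; i < n; i++)
		sum += v[i];

	return sum / (double)n;
}

/* Relative change of `value` against `baseline`, in percent. */
double pct_vs_baseline(double value, double baseline)
{
	assert(baseline != 0.0);

	return (value - baseline) / baseline * 100.0;
}
```

Feeding in the baseline runs and the kcompactd background runs from the table
gives averages of about 101,979.75 and 134,278.47, a difference of about
+31.67%.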
[1] https://lore.kernel.org/all/20260403191942.21410-1-dipiets@xxxxxxxxx/T/#m8baeeaf48aa7ae5342c8c2db8f4e1c27e03c1368