Re: [RFC PATCH 4/5] mm: swap: fall back to order-0 after large swapin races

From: David Hildenbrand (Arm)

Date: Mon May 11 2026 - 09:12:09 EST


On 5/8/26 22:20, fujunjie wrote:
> swapin_folio() documents that a large folio insertion race returns NULL
> so the caller can fall back to order-0 swapin. do_swap_page() currently
> turns that NULL into VM_FAULT_OOM if the PTE is unchanged, which is
> harsher than necessary and gets in the way of rejecting large folio
> ranges for backend reasons.
>
> Move the synchronous swapin sequence into a helper and retry with an
> order-0 folio when a large folio cannot be inserted into the swap cache.
> Count the event as an mTHP swapin fallback before dropping the failed
> large allocation.
>
> Signed-off-by: fujunjie <fujunjie1@xxxxxx>
> ---
> mm/memory.c | 50 +++++++++++++++++++++++++++++++++++++++-----------
> 1 file changed, 39 insertions(+), 11 deletions(-)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index ea6568571131..84e3b77b8293 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -4757,6 +4757,44 @@ static struct folio *alloc_swap_folio(struct vm_fault *vmf)
> }
> #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
>
> +static struct folio *swapin_synchronous_folio(swp_entry_t entry,
> +					     struct vm_fault *vmf)
> +{
> +	struct folio *swapcache, *folio;
> +	bool large;
> +	int order;
> +
> +	folio = alloc_swap_folio(vmf);
> +	if (!folio)
> +		return NULL;
> +
> +	large = folio_test_large(folio);
> +	order = folio_order(folio);
> +
> +	/*
> +	 * folio is charged, so swapin can only fail due to raced swapin and
> +	 * return NULL.
> +	 */
> +	swapcache = swapin_folio(entry, folio);
> +	if (swapcache == folio)
> +		return folio;
> +
> +	if (!swapcache && large)
> +		count_mthp_stat(order, MTHP_STAT_SWPIN_FALLBACK);
> +	folio_put(folio);
> +	if (swapcache || !large)
> +		return swapcache;
> +
> +	folio = __alloc_swap_folio(vmf);
> +	if (!folio)
> +		return NULL;
> +
> +	swapcache = swapin_folio(entry, folio);
> +	if (swapcache != folio)
> +		folio_put(folio);
> +	return swapcache;
> +}
> +
> /* Sanity check that a folio is fully exclusive */
> static void check_swap_exclusive(struct folio *folio, swp_entry_t entry,
> unsigned int nr_pages)
> @@ -4860,17 +4898,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
> 		swap_update_readahead(folio, vma, vmf->address);
> 	if (!folio) {
> 		if (data_race(si->flags & SWP_SYNCHRONOUS_IO)) {
> -			folio = alloc_swap_folio(vmf);
> -			if (folio) {
> -				/*
> -				 * folio is charged, so swapin can only fail due
> -				 * to raced swapin and return NULL.
> -				 */
> -				swapcache = swapin_folio(entry, folio);
> -				if (swapcache != folio)
> -					folio_put(folio);
> -				folio = swapcache;
> -			}
> +			folio = swapin_synchronous_folio(entry, vmf);
> 		} else {
> 			folio = swapin_readahead(entry, GFP_HIGHUSER_MOVABLE, vmf);
> 		}

There are some upcoming changes with:

https://lore.kernel.org/r/20260421-swap-table-p4-v3-5-2f23759a76bc@xxxxxxxxxxx


All of that logic you have in swapin_synchronous_folio() should ideally not
go into memory.c, but into some swap-specific code.

But

https://lore.kernel.org/r/20260421-swap-table-p4-v3-0-2f23759a76bc@xxxxxxxxxxx

already changes a lot of that.

--
Cheers,

David