Re: [RFC PATCH] mm: Avoiding split large folios if swap has no space

From: David Hildenbrand (Arm)

Date: Mon Jun 22 2026 - 04:58:40 EST


On 6/20/26 10:10, Barry Song (Xiaomi) wrote:
> On Fri, Jun 19, 2026 at 10:04 PM David Hildenbrand (Arm) <david@xxxxxxxxxx> wrote:
> [...]
>>> /*
>>> * The page can not be swapped.
>>> *
>>> @@ -1280,6 +1289,8 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
>>>
>>> if (!folio_test_large(folio))
>>> goto activate_locked_split;
>>> + if (!__can_reclaim_anon_pages(memcg, sc))
>>> + goto activate_locked_split;
>>
>> Why are we even trying to allocate swap space if we cannot reclaim such pages?
>> Makes we wonder whether we would want to have that check earlier, before the
>> folio_alloc_swap().
>>
>> Any downsides?
>
> I don't think there are any obvious downsides there. One issue is that
> the memcg may not be passed from reclaim_pages(), so memcg would
> always be NULL. However, the folio could still belong to a memcg
> whose swap quota has been exhausted. In that case, my
> __can_reclaim_anon_pages() will fail when checking whether we can
> swap out. But switching to folio_memcg() also seems awkward.
>
> So I feel Kairui’s suggestion [1] might be the best approach. In
> folio_alloc_swap(), we return -EAGAIN to tell vmscan.c that
> we can split the folio and retry the swap-out.
> only when there are sufficient swap slots and sufficient memcg swap
> quota do we return -EAGAIN, allowing vmscan to perform a split.
>
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 78b49b0658ad..62e2c506ccae 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -1755,6 +1755,9 @@ int folio_alloc_swap(struct folio *folio)
> VM_WARN_ON_ONCE(1);
> return -EINVAL;
> }
> +
> + if (get_nr_swap_pages() < (1 << order))
> + return -ENOMEM;

I guess we could be clearer with the return value:

-> !get_nr_swap_pages() -> -ENOSPC / -ENOMEM

(no space at all)

-> get_nr_swap_pages() < (1 << order) -> -E2BIG

(there is some space, but not for the full thing)

But now I wonder whether we would also want to check "is there any free swap
space", not just "is there any swap".


Essentially, try returning -E2BIG if there is the chance to swap out after
split, and -ENOSPC / -ENOMEM if a split wouldn't help.

> }
>
> again:
> @@ -1769,11 +1772,13 @@ int folio_alloc_swap(struct folio *folio)
> }
>
> /* Need to call this even if allocation failed, for MEMCG_SWAP_FAIL. */
> - if (unlikely(mem_cgroup_try_charge_swap(folio)))
> + if (unlikely(mem_cgroup_try_charge_swap(folio))) {
> swap_cache_del_folio(folio);
> + return -ENOMEM;

Here we wouldn't have the information whether we could charge after a split.

So that would require a rework to signal this more cleanly to the caller.

> + }
>
> if (unlikely(!folio_test_swapcache(folio)))
> - return -ENOMEM;
> + return -EAGAIN;
>
> return 0;
> }
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 299b5d9e8836..63e8578454ea 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1257,6 +1257,8 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
> */
> if (folio_test_anon(folio) && folio_test_swapbacked(folio) &&
> !folio_test_swapcache(folio)) {
> + int ret;
> +
> if (!(sc->gfp_mask & __GFP_IO))
> goto keep_locked;
> if (folio_maybe_dma_pinned(folio))
> @@ -1275,10 +1277,10 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
> split_folio_to_list(folio, folio_list))
> goto activate_locked;
> }
> - if (folio_alloc_swap(folio)) {
> + if ((ret = folio_alloc_swap(folio))) {

I prefer doing the assignment outside the conditional.

> int __maybe_unused order = folio_order(folio);
>
> - if (!folio_test_large(folio))
> + if (!folio_test_large(folio) || ret != -EAGAIN)
> goto activate_locked_split;
> /* Fallback to swap normal pages */
> if (split_folio_to_list(folio, folio_list))
>
> What’s your view on this, David?

I guess returning from folio_alloc_swap() whether a split could allow for
swapout (e.g., -E2BIG) would be reasonable.

To catch all the cases where it makes a difference:
* No free swap space (split won't work)
* Some free swap space (split would work)
* Sufficient free swap space, but fragmented (split would work)
* No memcg space (split won't work)
* Some memcg space (split would work)

--
Cheers,

David