Re: [RFC PATCH] mm: Avoiding split large folios if swap has no space
From: Baolin Wang
Date: Mon Jun 22 2026 - 00:11:51 EST
On 6/22/26 11:36 AM, Barry Song wrote:
On Mon, Jun 22, 2026 at 11:04 AM Baolin Wang
<baolin.wang@xxxxxxxxxxxxxxxxx> wrote:
On 6/20/26 4:10 PM, Barry Song (Xiaomi) wrote:
On Fri, Jun 19, 2026 at 10:04 PM David Hildenbrand (Arm) <david@xxxxxxxxxx> wrote:
[...]
/*
* The page can not be swapped.
*
@@ -1280,6 +1289,8 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
if (!folio_test_large(folio))
goto activate_locked_split;
+ if (!__can_reclaim_anon_pages(memcg, sc))
+ goto activate_locked_split;
Why are we even trying to allocate swap space if we cannot reclaim such pages?
Makes me wonder whether we would want to have that check earlier, before the
folio_alloc_swap().
Any downsides?
I don't think there are any obvious downsides there. One issue is that
the memcg may not be passed from reclaim_pages(), so memcg would
always be NULL. However, the folio could still belong to a memcg
whose swap quota has been exhausted. In that case, my
__can_reclaim_anon_pages() will fail when checking whether we can
swap out. But switching to folio_memcg() also seems awkward.
So I feel Kairui’s suggestion [1] might be the best approach. In
folio_alloc_swap(), we return -EAGAIN to tell vmscan.c that
we can split the folio and retry the swap-out.
Only when there are sufficient swap slots and sufficient memcg swap
quota do we return -EAGAIN, allowing vmscan to perform a split.
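To make the proposed return-value contract concrete, here is a minimal userspace sketch of how a caller might act on it. This is not the vmscan.c code: the enum, its values, and action_for() are hypothetical names for illustration, and the only thing taken from the thread is that -EAGAIN means "splitting and retrying may help" while other errors mean it will not.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical outcomes for the reclaim path; not kernel identifiers. */
enum reclaim_action {
	PROCEED_SWAPOUT,	/* swap slots allocated, continue with swap-out */
	SPLIT_AND_RETRY,	/* split the large folio, retry on the pieces */
	ACTIVATE_FOLIO,		/* give up on swapping this folio for now */
};

/*
 * Map the return value of a folio_alloc_swap()-like call to an action.
 * Assumption: only -EAGAIN on a large folio justifies a split.
 */
static enum reclaim_action action_for(int ret, bool is_large)
{
	if (ret == 0)
		return PROCEED_SWAPOUT;
	if (ret == -EAGAIN && is_large)
		return SPLIT_AND_RETRY;
	return ACTIVATE_FOLIO;
}
```

With this shape, -ENOMEM on an order-9 folio goes straight to ACTIVATE_FOLIO without paying for a split that cannot succeed.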
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 78b49b0658ad..62e2c506ccae 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1755,6 +1755,9 @@ int folio_alloc_swap(struct folio *folio)
VM_WARN_ON_ONCE(1);
return -EINVAL;
}
+
+ if (get_nr_swap_pages() < (1 << order))
+ return -ENOMEM;
Shouldn't this return -EAGAIN? Suppose we try to swap out an order-9
large folio but get_nr_swap_pages() returns 256, then we'd still need to
split the order-9 large folio to reclaim some memory.
I guess Kairui has the opposite view. Quoting Kairui:
"- 1. the mem_cgroup_try_charge_swap in it failed
- 2. allocation failed but nr_swap_pages > folio size
- 3. allocation failed because all devices are full or unusable
(roughly nr_swap_pages < folio size)
Only case 2 requires splitting. __can_reclaim_anon_pages also checks
demote which is not related to swap."
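The three cases quoted above can be sketched as a small standalone classifier. This is only a userspace model, not kernel code: classify_swap_alloc_failure() and its parameters are made-up names, and treating nr_swap_pages equal to the folio size as case 2 is an assumption borrowed from the "<" comparison in the diff.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Classify a failed swap allocation for a folio of the given order.
 * Returns -EAGAIN when a split might help (case 2) and -ENOMEM when
 * it cannot (cases 1 and 3).
 */
static int classify_swap_alloc_failure(bool memcg_charge_failed,
				       long nr_swap_pages,
				       unsigned int order)
{
	long folio_pages = 1L << order;

	/* Case 1: the memcg swap charge failed; splitting gains nothing. */
	if (memcg_charge_failed)
		return -ENOMEM;

	/* Case 3: fewer free slots than the folio needs. */
	if (nr_swap_pages < folio_pages)
		return -ENOMEM;

	/* Case 2: enough slots overall, but no large-enough free range. */
	return -EAGAIN;
}
```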
For example, if get_nr_swap_pages() returns 1, we can only
swap out a single page after splitting. Is that really worth
it? On the other hand, if it returns 256, we could at least
swap out half of an order-9 large folio. Isn't that worthwhile?
The former doesn't seem necessary, but the latter might be worthwhile.
After splitting, we could potentially avoid premature OOM. Of course,
splitting also means losing an order-9 large folio, but we can rely on
khugepaged to collapse it into a THP again. But the consequences of OOM
are arguably more severe?
Anyway, just raising my concern here (I don't have data at this point to justify whether more complex logic is needed).
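The arithmetic behind the 1-versus-256 example can be sketched as follows. This is purely illustrative: both helpers and the min_percent threshold are hypothetical, no such threshold was proposed in the thread, and the model assumes every free slot is usable (ignoring memcg swap limits and device state).

```c
#include <stdbool.h>

/* Pages of an order-N folio that could be swapped out after a split. */
static unsigned long swappable_after_split(unsigned int order,
					   unsigned long free_slots)
{
	unsigned long folio_pages = 1UL << order;

	return free_slots < folio_pages ? free_slots : folio_pages;
}

/*
 * Hypothetical policy: split only if at least min_percent of the
 * folio's pages would fit into the remaining swap slots.
 */
static bool split_worthwhile(unsigned int order, unsigned long free_slots,
			     unsigned int min_percent)
{
	unsigned long folio_pages = 1UL << order;

	return swappable_after_split(order, free_slots) * 100 >=
	       folio_pages * min_percent;
}
```

For an order-9 folio (512 pages), 256 free slots cover exactly half of the folio, so a 50% threshold would allow the split, while a single free slot covers 1 page of 512 and would not.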
again:
@@ -1769,11 +1772,13 @@ int folio_alloc_swap(struct folio *folio)
}
/* Need to call this even if allocation failed, for MEMCG_SWAP_FAIL. */
- if (unlikely(mem_cgroup_try_charge_swap(folio)))
+ if (unlikely(mem_cgroup_try_charge_swap(folio))) {
swap_cache_del_folio(folio);
+ return -ENOMEM;
+ }
if (unlikely(!folio_test_swapcache(folio)))
- return -ENOMEM;
+ return -EAGAIN;
return 0;
}
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 299b5d9e8836..63e8578454ea 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1257,6 +1257,8 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
*/
if (folio_test_anon(folio) && folio_test_swapbacked(folio) &&
!folio_test_swapcache(folio)) {
+ int ret;
+
if (!(sc->gfp_mask & __GFP_IO))
goto keep_locked;
if (folio_maybe_dma_pinned(folio))
@@ -1275,10 +1277,10 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
split_folio_to_list(folio, folio_list))
goto activate_locked;
}
- if (folio_alloc_swap(folio)) {
+ if ((ret = folio_alloc_swap(folio))) {
Also, please give shmem some love (shmem also calls folio_alloc_swap()
when swapping out) :)
Sure. I'm pretty sure I love shmem. :-)
Thanks.