Re: [RFC PATCH] mm: Avoiding split large folios if swap has no space

From: Kairui Song

Date: Fri Jun 19 2026 - 15:18:23 EST


On Fri, Jun 19, 2026 at 6:17 AM Barry Song (Xiaomi) <baohua@xxxxxxxxxx> wrote:
>
> When swap is disabled or exhausted, swap slot allocation
> may fail during swapout, causing large folios to be split
> into small folios. The splitting is reasonable when we
> truly fail to obtain contiguous swap slots, but it is
> pointless in the no-space case.
>
> A simple way to reproduce this is to invoke MADV_PAGEOUT on
> a system with mTHP enabled but without swap configured.
>
> #define SIZE (16 * 1024 * 1024)
> int main(void)
> {
> char *buf = mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
> MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> memset(buf, 1, SIZE);
> madvise(buf, SIZE, MADV_PAGEOUT);
> munmap(buf, SIZE);
> return 0;
> }
>
> With 16KB mTHP enabled, we observe:
> ~ # cat /sys/kernel/mm/transparent_hugepage/hugepages-16kB/stats/split
> 1024
>
> This patch checks swap space before splitting. If there is
> no available space, it skips splitting. After the patch, we
> observe:
> ~ # cat /sys/kernel/mm/transparent_hugepage/hugepages-16kB/stats/split
> 0
>
> Reported-by: Nanzhe Zhao <zhaonanzhe@xxxxxxxxxx>
> Cc: David Hildenbrand <david@xxxxxxxxxx>
> Cc: Lorenzo Stoakes <ljs@xxxxxxxxxx>
> Cc: Zi Yan <ziy@xxxxxxxxxx>
> Cc: Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx>
> Cc: Liam R. Howlett <liam@xxxxxxxxxxxxx>
> Cc: Nico Pache <npache@xxxxxxxxxx>
> Cc: Ryan Roberts <ryan.roberts@xxxxxxx>
> Cc: Dev Jain <dev.jain@xxxxxxx>
> Cc: Lance Yang <lance.yang@xxxxxxxxx>
> Cc: Kairui Song <kasong@xxxxxxxxxxx>
> Cc: Qi Zheng <qi.zheng@xxxxxxxxx>
> Cc: Shakeel Butt <shakeel.butt@xxxxxxxxx>
> Cc: Axel Rasmussen <axelrasmussen@xxxxxxxxxx>
> Cc: Yuanchu Xie <yuanchu@xxxxxxxxxx>
> Cc: Wei Xu <weixugc@xxxxxxxxxx>
> Signed-off-by: Barry Song (Xiaomi) <baohua@xxxxxxxxxx>
> ---
> mm/vmscan.c | 15 +++++++++++++--
> 1 file changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 299b5d9e8836..33f84a5fe7ee 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -339,8 +339,7 @@ static bool can_demote(int nid, struct scan_control *sc,
> return !nodes_empty(allowed_mask);
> }
>
> -static inline bool can_reclaim_anon_pages(struct mem_cgroup *memcg,
> - int nid,
> +static inline bool __can_reclaim_anon_pages(struct mem_cgroup *memcg,
> struct scan_control *sc)
> {
> if (memcg == NULL) {
> @@ -356,6 +355,16 @@ static inline bool can_reclaim_anon_pages(struct mem_cgroup *memcg,
> return true;
> }
>
> + return false;
> +}
> +
> +static inline bool can_reclaim_anon_pages(struct mem_cgroup *memcg,
> + int nid,
> + struct scan_control *sc)
> +{
> + if (__can_reclaim_anon_pages(memcg, sc))
> + return true;
> +
> /*
> * The page can not be swapped.
> *
> @@ -1280,6 +1289,8 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
>
> if (!folio_test_large(folio))
> goto activate_locked_split;
> + if (!__can_reclaim_anon_pages(memcg, sc))
> + goto activate_locked_split;
> /* Fallback to swap normal pages */
> if (split_folio_to_list(folio, folio_list))
> goto activate_locked;

Hello Barry,

Thanks for raising this issue. I saw a similar report on the mailing
list before, and I was thinking that another approach might be to let
folio_alloc_swap return a more detailed error code, for example:

1. the mem_cgroup_try_charge_swap call inside it failed
2. allocation failed but nr_swap_pages > folio size
3. allocation failed because all devices are full or unusable
   (roughly nr_swap_pages < folio size)

Only case 2 requires splitting. __can_reclaim_anon_pages also checks
demotion, which is unrelated to swap.
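
To make the three cases concrete, here is a rough userspace sketch of the
classification and of the split decision the caller would make. This is
only a model, not kernel code: the enum, its values, and both helper names
are hypothetical, and the plain bool/long parameters stand in for the
result of mem_cgroup_try_charge_swap, nr_swap_pages and the folio size.
How to treat nr_swap_pages == folio size is also an assumption, since the
cases above only say ">" and "roughly <".

```c
#include <stdbool.h>

/*
 * Hypothetical result codes for a more detailed folio_alloc_swap.
 * These names are illustrative only and do not exist in the kernel.
 */
enum swap_alloc_result {
	SWAP_ALLOC_OK,
	SWAP_ALLOC_CHARGE_FAILED,	/* case 1: memcg swap charge failed */
	SWAP_ALLOC_FRAGMENTED,		/* case 2: space left, no contiguous slots */
	SWAP_ALLOC_NO_SPACE,		/* case 3: devices full or unusable */
};

/*
 * Classify a failed allocation. The inputs model the kernel state:
 * charge_failed    - the memcg swap charge was rejected
 * nr_swap_pages    - free swap slots across all devices
 * folio_nr_pages   - number of pages in the folio being swapped out
 */
enum swap_alloc_result classify_swap_alloc_failure(bool charge_failed,
						   long nr_swap_pages,
						   long folio_nr_pages)
{
	if (charge_failed)
		return SWAP_ALLOC_CHARGE_FAILED;

	/* Assumption: exactly folio_nr_pages free slots counts as case 2. */
	if (nr_swap_pages < folio_nr_pages)
		return SWAP_ALLOC_NO_SPACE;

	return SWAP_ALLOC_FRAGMENTED;
}

/* Only case 2 can be helped by falling back to small folios. */
bool should_split_large_folio(enum swap_alloc_result res)
{
	return res == SWAP_ALLOC_FRAGMENTED;
}
```

With something like this, shrink_folio_list would only call
split_folio_to_list when the helper reports the fragmented case, and
would go straight to activate_locked_split for the other two.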