Re: [RFC] mm: restrict zero-page remapping to underused THP splits
From: Nico Pache
Date: Mon Jun 08 2026 - 07:25:30 EST
On Mon, Jun 8, 2026 at 5:14 AM David Hildenbrand (Arm) <david@xxxxxxxxxx> wrote:
>
> On 6/8/26 12:34, Nico Pache wrote:
> > On Fri, May 8, 2026 at 3:32 PM David Hildenbrand (Arm) <david@xxxxxxxxxx> wrote:
> >>
> >> On 5/8/26 19:05, Nico Pache wrote:
> >>> Since commit b1f202060afe ("mm: remap unused subpages to shared zeropage
> >>> when splitting isolated thp"), splitting an anonymous THP remaps all
> >>> zero-filled subpages to the shared zeropage via TTU_USE_SHARED_ZEROPAGE.
> >>> This flag is set unconditionally for every anonymous folio split,
> >>> including splits triggered by KSM.
> >>
> >> And even when the underused scanner is effectively disabled on a system. Hm.
> >>
> >> I don't quite like that we scan for zeropages when nobody even requested us to
> >> split because of zeropages.
> >>
> >> I can see why we would want to scan for zeropages in a setup where the underused
> >> scanner is active, even when the split was triggered by someone/something else
> >> (below).
> >>
> >> [...]
> >>
> >>> /**
> >>> @@ -4340,7 +4341,13 @@ int folio_split(struct folio *folio, unsigned int new_order,
> >>> struct page *split_at, struct list_head *list)
> >>> {
> >>> return __folio_split(folio, new_order, split_at, &folio->page, list,
> >>> - SPLIT_TYPE_NON_UNIFORM);
> >>> + SPLIT_TYPE_NON_UNIFORM, false);
> >>> +}
> >>> +
> >>> +int folio_split_underused(struct folio *folio)
> >>> +{
> >>> + return __folio_split(folio, 0, &folio->page, &folio->page,
> >>> + NULL, SPLIT_TYPE_NON_UNIFORM, true);
> >>> }
> >>>
> >>> /**
> >>> @@ -4559,7 +4566,7 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
> >>> }
> >>> if (!folio_trylock(folio))
> >>> goto requeue;
> >>> - if (!split_folio(folio)) {
> >>> + if (!folio_split_underused(folio)) {
> >>> did_split = true;
> >>> if (underused)
> >>> count_vm_event(THP_UNDERUSED_SPLIT_PAGE);
> >>
> >> In general, this looks clean.
> >>
> >> But imagine the following: someone splits the THP for another reason: for
> >> example, because migration is unable to allocate a 2M THP, or because we have to
> >> split on swapout etc.
> >>
> >> Not freeing the zero-filled pages means that these pages cannot be reclaimed
> >> anymore easily. We split a possibly underused THP but didn't free the memory.
> >>
> >> The only way to free the memory would be to wait for another collapse, and then
> >> have the new THP be detected as underused.
> >>
> >> Hm.
> >>
> >> (1) As you say, the alternative is to let KSM say that it wants to handle the
> >> zero-filled pages itself. I'm not the biggest fan of that approach. We still
> >> have two mechanisms interacting to some degree.
> >>
> >> (2) Another approach is to just let KSM handle this in VMAs that are marked as
> >> mergeable while KSM is active. That is, we check for VM_MERGEABLE and ksm_run ==
> >> KSM_RUN_MERGE in try_to_map_unused_to_zeropage() to just let KSM do its thing.
> >>
> >> That really just stops both mechanisms from interacting.
> >>
> >> (3) Yet another approach I could think of (in general) is to disable the
> >> underused handling in a system where the underused splitting is entirely disabled.
> >>
> >> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> >> index e9d499da0ac7..5eca99271957 100644
> >> --- a/mm/huge_memory.c
> >> +++ b/mm/huge_memory.c
> >> @@ -82,6 +82,14 @@ unsigned long huge_anon_orders_madvise __read_mostly;
> >> unsigned long huge_anon_orders_inherit __read_mostly;
> >> static bool anon_orders_configured __initdata;
> >>
> >> +static bool thp_underused_split_active(void)
> >> +{
> >> + if (!split_underused_thp)
> >> + return false;
> >> +
> >> + return khugepaged_max_ptes_none != HPAGE_PMD_NR - 1;
> >> +}
> >> +
> >> static inline bool file_thp_enabled(struct vm_area_struct *vma)
> >> {
> >> struct inode *inode;
> >> @@ -4188,7 +4196,8 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
> >> if (nr_shmem_dropped)
> >> shmem_uncharge(mapping->host, nr_shmem_dropped);
> >>
> >> - if (!ret && is_anon && !folio_is_device_private(folio))
> >> + if (!ret && is_anon && !folio_is_device_private(folio) &&
> >> + thp_underused_split_active())
> >> ttu_flags = TTU_USE_SHARED_ZEROPAGE;
> >>
> >> remap_page(folio, 1 << old_order, ttu_flags);
> >> @@ -4497,7 +4506,7 @@ static bool thp_underused(struct folio *folio)
> >> int num_zero_pages = 0, num_filled_pages = 0;
> >> int i;
> >>
> >> - if (khugepaged_max_ptes_none == HPAGE_PMD_NR - 1)
> >> + if (!thp_underused_split_active())
> >> return false;
> >>
> >> if (folio_contain_hwpoisoned_page(folio))
> >>
> >>
> >>
> >> I tend to like (2), and maybe (3) on top. Opinions?
> >
> > Coming back to this.
> >
> > for (2), I have to export the KSM run state, which may be fine, but it
>
> Right, you just need a simple helper.
Ack. I'll add that piece back, thanks!
>
> > introduces a race window. If a user disables KSM, a split occurs, and
> > then re-enables it, the bug will present itself again.
>
> Who cares?
Haha, okay, fair enough. I was just making sure before I resend.
>
> >
> > Would it be better to just check if it's VM_MERGEABLE?
>
> Some user space unconditionally sets VM_MERGEABLE, even if KSM is never enabled
> (IIRC QEMU, for example).
ack! I'll keep the is_ksm_running() check :)
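
For reference, the combined check from (2) could look roughly like the
userspace model below. This is only a sketch, not the patch itself:
is_ksm_running() and leave_zero_subpages_to_ksm() are placeholder names, and
the constant values are stand-ins chosen for the model, not taken from the
kernel headers. In the real code the test would sit in
try_to_map_unused_to_zeropage() and look at the VMA's flags.

```c
#include <stdbool.h>

/* Stand-in values for this userspace model only (assumptions). */
#define VM_MERGEABLE	0x80000000UL
#define KSM_RUN_STOP	0UL
#define KSM_RUN_MERGE	1UL

/* Models the KSM run state that would need to be exposed via a helper. */
static unsigned long ksm_run = KSM_RUN_STOP;

/* Placeholder helper: true only while KSM is actively merging. */
static bool is_ksm_running(void)
{
	return ksm_run == KSM_RUN_MERGE;
}

/*
 * Placeholder predicate: returns true when the split path should skip the
 * shared-zeropage remap and leave zero-filled subpages for KSM to handle.
 * Both conditions are required, because some user space sets VM_MERGEABLE
 * even when KSM is never enabled.
 */
static bool leave_zero_subpages_to_ksm(unsigned long vm_flags)
{
	return (vm_flags & VM_MERGEABLE) && is_ksm_running();
}
```

With KSM stopped, a VM_MERGEABLE VMA still gets the zeropage remap on split;
only a mergeable VMA with KSM running is left alone.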
Cheers,
-- Nico
>
> --
> Cheers,
>
> David
>