Re: [RFC] mm: restrict zero-page remapping to underused THP splits

From: David Hildenbrand (Arm)

Date: Mon Jun 08 2026 - 07:14:36 EST


On 6/8/26 12:34, Nico Pache wrote:
> On Fri, May 8, 2026 at 3:32 PM David Hildenbrand (Arm) <david@xxxxxxxxxx> wrote:
>>
>> On 5/8/26 19:05, Nico Pache wrote:
>>> Since commit b1f202060afe ("mm: remap unused subpages to shared zeropage
>>> when splitting isolated thp"), splitting an anonymous THP remaps all
>>> zero-filled subpages to the shared zeropage via TTU_USE_SHARED_ZEROPAGE.
>>> This flag is set unconditionally for every anonymous folio split,
>>> including splits triggered by KSM.
>>
>> And even when the underused scanner is effectively disabled on a system. Hm.
>>
>> I don't quite like that we scan for zeropages when nobody even requested us to
>> split because of zeropages.
>>
>> I can see why we would want to scan for zeropages in a setup where the underused
>> scanner is active, even when the split was triggered by someone/something else
>> (below).
>>
>> [...]
>>
>>> /**
>>> @@ -4340,7 +4341,13 @@ int folio_split(struct folio *folio, unsigned int new_order,
>>> struct page *split_at, struct list_head *list)
>>> {
>>> return __folio_split(folio, new_order, split_at, &folio->page, list,
>>> - SPLIT_TYPE_NON_UNIFORM);
>>> + SPLIT_TYPE_NON_UNIFORM, false);
>>> +}
>>> +
>>> +int folio_split_underused(struct folio *folio)
>>> +{
>>> + return __folio_split(folio, 0, &folio->page, &folio->page,
>>> + NULL, SPLIT_TYPE_NON_UNIFORM, true);
>>> }
>>>
>>> /**
>>> @@ -4559,7 +4566,7 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
>>> }
>>> if (!folio_trylock(folio))
>>> goto requeue;
>>> - if (!split_folio(folio)) {
>>> + if (!folio_split_underused(folio)) {
>>> did_split = true;
>>> if (underused)
>>> count_vm_event(THP_UNDERUSED_SPLIT_PAGE);
>>
>> In general, this looks clean.
>>
>> But imagine the following: someone splits the THP for another reason: for
>> example, because migration is unable to allocate a 2M THP, or because we have to
>> split on swapout etc.
>>
>> Not freeing the zero-filled pages means that these pages can no longer be
>> reclaimed easily. We split a possibly underused THP but didn't free the memory.
>>
>> The only way to free the memory would be to wait for another collapse, and then
>> have the new THP be detected as underused.
>>
>> Hm.
>>
>> (1) As you say, the alternative is to let KSM say that it wants to handle the
>> zero-filled pages itself. I'm not the biggest fan of that approach. We still
>> have two mechanisms interacting to some degree.
>>
>> (2) Another approach is to just let KSM handle this in VMAs that are marked as
>> mergeable while KSM is active. That is, we check for VM_MERGEABLE and ksm_run ==
>> KSM_RUN_MERGE in try_to_map_unused_to_zeropage() to just let KSM do its thing.
>>
>> That really just stops both mechanisms from interacting.
>>
>> (3) Yet another approach I could think of (in general) is to disable the
>> underused handling in a system where the underused splitting is entirely disabled.
>>
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index e9d499da0ac7..5eca99271957 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>> @@ -82,6 +82,14 @@ unsigned long huge_anon_orders_madvise __read_mostly;
>> unsigned long huge_anon_orders_inherit __read_mostly;
>> static bool anon_orders_configured __initdata;
>>
>> +static bool thp_underused_split_active(void)
>> +{
>> + if (!split_underused_thp)
>> + return false;
>> +
>> + return khugepaged_max_ptes_none != HPAGE_PMD_NR - 1;
>> +}
>> +
>> static inline bool file_thp_enabled(struct vm_area_struct *vma)
>> {
>> struct inode *inode;
>> @@ -4188,7 +4196,8 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
>> if (nr_shmem_dropped)
>> shmem_uncharge(mapping->host, nr_shmem_dropped);
>>
>> - if (!ret && is_anon && !folio_is_device_private(folio))
>> + if (!ret && is_anon && !folio_is_device_private(folio) &&
>> + thp_underused_split_active())
>> ttu_flags = TTU_USE_SHARED_ZEROPAGE;
>>
>> remap_page(folio, 1 << old_order, ttu_flags);
>> @@ -4497,7 +4506,7 @@ static bool thp_underused(struct folio *folio)
>> int num_zero_pages = 0, num_filled_pages = 0;
>> int i;
>>
>> - if (khugepaged_max_ptes_none == HPAGE_PMD_NR - 1)
>> + if (!thp_underused_split_active())
>> return false;
>>
>> if (folio_contain_hwpoisoned_page(folio))
>>
>>
>>
>> I tend to like (2), and maybe (3) on top. Opinions?
>
> Coming back to this.
>
> For (2), I have to export the KSM run state, which may be fine, but it

Right, you just need a simple helper.

> introduces a race window. If a user disables KSM, a split occurs, and
> then re-enables it, the bug will present itself again.

Who cares?

>
> Would it be better to just check if it's VM_MERGEABLE?

Some user space unconditionally sets VM_MERGEABLE, even if KSM is never enabled
(IIRC QEMU, for example).
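
To make the check in (2) concrete, here is a minimal userspace sketch of the decision logic, not kernel code. The helper names ksm_merging_active() and zeropage_remap_allowed() are hypothetical, and the struct and constant definitions are stand-ins so the snippet compiles on its own; in the kernel the test would sit in try_to_map_unused_to_zeropage() and use the real VM_MERGEABLE and ksm_run.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for kernel definitions; values are illustrative only. */
#define VM_MERGEABLE	0x1UL
#define KSM_RUN_STOP	0
#define KSM_RUN_MERGE	1

struct vm_area_struct {
	unsigned long vm_flags;
};

/* Models the KSM run state that is private to mm/ksm.c today. */
static unsigned long ksm_run = KSM_RUN_STOP;

/* Hypothetical helper: the "simple helper" exposing the run state. */
static bool ksm_merging_active(void)
{
	return ksm_run == KSM_RUN_MERGE;
}

/*
 * Hypothetical helper: returns false only when the VMA is marked
 * mergeable AND KSM is actively merging, so KSM handles zero-filled
 * pages itself. Checking VM_MERGEABLE alone would not be enough,
 * because some user space sets it even if KSM is never enabled.
 */
static bool zeropage_remap_allowed(const struct vm_area_struct *vma)
{
	if ((vma->vm_flags & VM_MERGEABLE) && ksm_merging_active())
		return false;
	return true;
}

/* Small self-check covering the combinations discussed in this thread. */
static void zeropage_remap_selftest(void)
{
	struct vm_area_struct mergeable = { .vm_flags = VM_MERGEABLE };
	struct vm_area_struct plain = { .vm_flags = 0 };

	ksm_run = KSM_RUN_STOP;
	assert(zeropage_remap_allowed(&mergeable));
	assert(zeropage_remap_allowed(&plain));

	ksm_run = KSM_RUN_MERGE;
	assert(!zeropage_remap_allowed(&mergeable));
	assert(zeropage_remap_allowed(&plain));
}
```

With this shape, a VMA that is VM_MERGEABLE while KSM is stopped still gets its zero-filled subpages remapped on split, which is the QEMU-style case above.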

--
Cheers,

David