Re: [PATCH mm-unstable v19 11/14] mm/khugepaged: Introduce mTHP collapse support

From: David Hildenbrand (Arm)

Date: Tue Jun 09 2026 - 05:39:29 EST


On 6/9/26 11:06, Nico Pache wrote:
> On Mon, Jun 8, 2026 at 8:57 AM David Hildenbrand (Arm) <david@xxxxxxxxxx> wrote:
>>
>> On 6/6/26 12:28, Lance Yang wrote:
>>>
>>>
>>> Looks broken for swap PTEs in PMD collapse ...
>>>
>>> collapse_scan_pmd() allows them up to max_ptes_swap and record them in
>>> unmapped, but they don't get a bit in mthp_present_ptes. And then
>>> mthp_collapse() does the check above:
>>
>> Right. I assumed this is implicitly handled by the optimization in collapse_scan_pmd:
>>
>> if (enabled_orders != BIT(HPAGE_PMD_ORDER))
>> max_ptes_none = KHUGEPAGED_MAX_PTES_LIMIT;
>>
>> But we perform the check a second time.
>>
>>>
>>> nr_occupied_ptes >= nr_ptes - max_ptes_none
>>>
>>> So max_ptes_none=0 + 511 present PTEs + one allowed swap PTE won't even
>>> call collapse_huge_page() for PMD order.
>>>
>>> Shouldn't we account for them in the PMD-order check? Something like:
>>>
>>> if (is_pmd_order(order))
>>> nr_occupied_ptes += unmapped;
>
> This solution seems good for a temporary fixup. but longterm we may
> want something else. I'm still not sure how we plan on supporting
> swapin without causing creep. So I'd be ok with adding a fix for
> legacy PMD behavior until we know how to handle mTHP creep correctly.
>
>> As an alternative, we could either 1) skip the check there for
>> pmd order (as the check was already done); or 2) introduce+maintain
>> a bitmap that tracks non-present PTEs.
>>
>> @@ -1475,7 +1477,9 @@ static enum scan_result mthp_collapse(struct mm_struct *mm,
>> nr_occupied_ptes = bitmap_weight_from(cc->mthp_present_ptes, offset,
>> offset + nr_ptes);
>>
>> - if (nr_occupied_ptes >= nr_ptes - max_ptes_none) {
>> + /* Check was already done in the caller. */
>> + if (is_pmd_order(order) ||
>> + nr_occupied_ptes >= nr_ptes - max_ptes_none) {
>> enum scan_result ret;
>>
>> collapse_address = address + offset * PAGE_SIZE;
>>
>> 2) would probably be cleanest long-term.
>
> That would be best for future swapin support in mTHP, but I still
> don't think it solves the creep issue.

It wouldn't, we'd simply maintain the state we collect + rely on in separate
bitmaps. On swapin, we'd have to update/refresh bitmaps I guess.

> Perhaps we could combine the
> two bitmaps to determine if it would make the future collapse eligible
> again? Not sure but ill start thinking about it.
>
> Should I send a fixup for this using Lance's solution? Or does Lance
> want to send a patch out with the fixes tag?

If Lance could send a fixup, explaining the situation, that would be nice.

--
Cheers,

David