Re: [PATCH mm-unstable 1/1] mm/khugepaged: fix PMD collapse swap PTE accounting
From: Lorenzo Stoakes
Date: Tue Jun 09 2026 - 10:53:44 EST
On Tue, Jun 09, 2026 at 03:16:10PM +0200, David Hildenbrand (Arm) wrote:
> On 6/9/26 14:04, Lance Yang wrote:
> > From: Lance Yang <lance.yang@xxxxxxxxx>
> >
> > mthp_collapse() uses mthp_present_ptes to decide whether a range has
> > enough occupied PTEs to try collapse. Swap PTEs accepted by
> > collapse_scan_pmd() are counted in unmapped, but are not represented in
> > mthp_present_ptes.
> >
> > When lower orders are enabled, collapse_scan_pmd() relaxes max_ptes_none
> > so the scan can cover the whole PMD and build the bitmap. mthp_collapse()
> > then checks the PMD-order candidate using the bitmap.
> >
> > With max_ptes_none set to 0, a range with 511 present PTEs and one swap
> > PTE no longer reaches collapse_huge_page(), even though PMD collapse can
> > handle swap PTEs up to max_ptes_swap.
> >
> > Account unmapped PTEs only for PMD order. PMD collapse supports swap PTEs
> > through max_ptes_swap, while lower-order mTHP collapse does not currently
> > support non-present PTEs. Keep non-present PTEs out of the lower-order
> > eligibility check.
> >
> > Signed-off-by: Lance Yang <lance.yang@xxxxxxxxx>
> > ---
> > Sent separately, as discussed in [1], to spell out the PMD-order swap PTE
> > case. Patch [2] is still only in mm-unstable, so no Fixes: tag.
> >
> > [1] https://lore.kernel.org/linux-mm/CAA1CXcD7WAiA1b9GTLAuNZ+kHaFx0SzZwpBkqAZ=s+RHsTUaow@xxxxxxxxxxxxxx/
> > [2] https://lore.kernel.org/linux-mm/20260605161422.213817-12-npache@xxxxxxxxxx/
> >
> > mm/khugepaged.c | 8 ++++++++
> > 1 file changed, 8 insertions(+)
> >
> > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> > index b12187709f6d..617bca76db49 100644
> > --- a/mm/khugepaged.c
> > +++ b/mm/khugepaged.c
> > @@ -1508,6 +1508,14 @@ static enum scan_result mthp_collapse(struct mm_struct *mm,
> >  	nr_occupied_ptes = bitmap_weight_from(cc->mthp_present_ptes, offset,
> >  					      offset + nr_ptes);
> >
> > +	/*
> > +	 * Swap PTEs accepted during the scan are counted in @unmapped,
> > +	 * not in the present-PTE bitmap. Account them for the PMD-order
> > +	 * candidate.
> > +	 */
> > +	if (is_pmd_order(order))
> > +		nr_occupied_ptes += unmapped;
> > +
>
> LGTM, there is a bit of opportunity for cleanup in the future :)
From my point of view, accepting the mTHP khugepaged changes was essentially a
big compromise on how much it adds to the mess of the existing code base, and
AFAIC we shouldn't accept any further major changes until we actually sort this
mess out :)
>
> Acked-by: David Hildenbrand (Arm) <david@xxxxxxxxxx>
>
>
> For example, as we no longer have the VMA here, collapse_max_ptes_none is
> imprecise in uffd VMAs. We might try collapsing where there sure is nothing to
> collapse.
>
> We could likely handle the userfaultfd_armed() part easier: some indication that
> we must not have any pte_none() would be sufficient.
>
> Also, I don't see a good reason why uffd would not be allowed to collapse with
> zeropages ... it's really just about missing faults due to pte_none().
Ugh uffd.
>
> --
> Cheers,
>
> David
Cheers, Lorenzo
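
For reference, the 511 present + 1 swap PTE case from the commit message can be
modelled in userspace roughly as below. This is only a sketch and not the
kernel code: range_eligible(), the count_swap_for_pmd switch, the MODEL_PMD_ORDER
value (assuming 4 KiB pages, so 512 PTEs per PMD) and the exact form of the
threshold comparison are all assumptions made for illustration.

```c
#include <stdbool.h>

/* Assumption: 4 KiB base pages, so a PMD covers 1 << 9 = 512 PTEs. */
#define MODEL_PMD_ORDER 9u

/*
 * Hypothetical model of the eligibility check discussed in the patch.
 * A candidate range of order @order is treated as eligible when the
 * number of occupied PTEs plus the allowed empty PTEs covers the range.
 *
 * @present:            PTEs set in the present-PTE bitmap for the range
 * @unmapped:           swap PTEs accepted during the PMD scan
 * @max_ptes_none:      allowed number of empty PTEs for this order
 * @count_swap_for_pmd: false models the old behaviour, true the patched one
 */
static bool range_eligible(unsigned int order, unsigned int present,
			   unsigned int unmapped, unsigned int max_ptes_none,
			   bool count_swap_for_pmd)
{
	unsigned int nr_ptes = 1u << order;
	unsigned int occupied = present;

	/* Only the PMD-order candidate accounts for swap PTEs. */
	if (count_swap_for_pmd && order == MODEL_PMD_ORDER)
		occupied += unmapped;

	return occupied + max_ptes_none >= nr_ptes;
}
```

Under these assumptions, with max_ptes_none set to 0 the old accounting rejects
the PMD-order candidate (511 < 512) and the patched accounting accepts it, while
lower orders are unaffected by the swap PTE count.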