Re: [PATCH mm-unstable 1/1] mm/khugepaged: fix PMD collapse swap PTE accounting
From: Nico Pache
Date: Tue Jun 09 2026 - 13:25:06 EST
On Tue, Jun 9, 2026 at 7:16 AM David Hildenbrand (Arm) <david@xxxxxxxxxx> wrote:
>
> On 6/9/26 14:04, Lance Yang wrote:
> > From: Lance Yang <lance.yang@xxxxxxxxx>
> >
> > mthp_collapse() uses mthp_present_ptes to decide whether a range has
> > enough occupied PTEs to try collapse. Swap PTEs accepted by
> > collapse_scan_pmd() are counted in unmapped, but are not represented in
> > mthp_present_ptes.
> >
> > When lower orders are enabled, collapse_scan_pmd() relaxes max_ptes_none
> > so the scan can cover the whole PMD and build the bitmap. mthp_collapse()
> > then checks the PMD-order candidate using the bitmap.
> >
> > With max_ptes_none set to 0, a range with 511 present PTEs and one swap
> > PTE no longer reaches collapse_huge_page(), even though PMD collapse can
> > handle swap PTEs up to max_ptes_swap.
> >
> > Account unmapped PTEs only for PMD order. PMD collapse supports swap PTEs
> > through max_ptes_swap, while lower-order mTHP collapse does not currently
> > support non-present PTEs. Keep non-present PTEs out of the lower-order
> > eligibility check.
> >
> > Signed-off-by: Lance Yang <lance.yang@xxxxxxxxx>
> > ---
> > Sent separately, as discussed in [1], to spell out the PMD-order swap PTE
> > case. Patch [2] is still only in mm-unstable, so no Fixes: tag.
> >
> > [1] https://lore.kernel.org/linux-mm/CAA1CXcD7WAiA1b9GTLAuNZ+kHaFx0SzZwpBkqAZ=s+RHsTUaow@xxxxxxxxxxxxxx/
> > [2] https://lore.kernel.org/linux-mm/20260605161422.213817-12-npache@xxxxxxxxxx/
> >
> > mm/khugepaged.c | 8 ++++++++
> > 1 file changed, 8 insertions(+)
> >
> > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> > index b12187709f6d..617bca76db49 100644
> > --- a/mm/khugepaged.c
> > +++ b/mm/khugepaged.c
> > @@ -1508,6 +1508,14 @@ static enum scan_result mthp_collapse(struct mm_struct *mm,
> > nr_occupied_ptes = bitmap_weight_from(cc->mthp_present_ptes, offset,
> > offset + nr_ptes);
> >
> > + /*
> > + * Swap PTEs accepted during the scan are counted in @unmapped,
> > + * not in the present-PTE bitmap. Account them for the PMD-order
> > + * candidate.
> > + */
> > + if (is_pmd_order(order))
> > + nr_occupied_ptes += unmapped;
> > +
>
> LGTM, there is a bit of opportunity for cleanup in the future :)
>
> Acked-by: David Hildenbrand (Arm) <david@xxxxxxxxxx>
>
>
> For example, as we no longer have the VMA here, collapse_max_ptes_none is
> imprecise in uffd VMAs. We might try collapsing where there sure is nothing to
> collapse.
>
> We could likely handle the userfaultfd_armed() part easier: some indication that
> we must not have any pte_none() would be sufficient.
>
> Also, I don't see a good reason why uffd would not be allowed to collapse with
> zeropages ... it's really just about missing faults due to pte_none().
I have some patches for exactly this :) So far it is 2-3 patches for
better uffd handling.
>
> --
> Cheers,
>
> David
>
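
For reference, the accounting described in the quoted commit message can be
modelled in user space. This is only a sketch under stated assumptions, not
the code in mm/khugepaged.c: range_eligible(), its parameters and the
account_swap switch are hypothetical names, the eligibility test is assumed
to be "occupied PTEs plus the allowed none PTEs cover the whole range", and
PMD order is assumed to be 9 (4 KiB pages, 512 PTEs per PMD).

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: 4 KiB base pages, so a PMD maps 2^9 = 512 PTEs. */
#define MODEL_PMD_ORDER 9u

/*
 * Hypothetical user-space model of the eligibility check.
 *
 * order          collapse order being considered
 * nr_present     PTEs set in the present-PTE bitmap for the range
 * unmapped       swap PTEs accepted by the scan
 * max_ptes_none  none PTEs tolerated for this order
 * account_swap   true models the patched behaviour, false the old one
 */
static bool range_eligible(unsigned int order, unsigned int nr_present,
			   unsigned int unmapped, unsigned int max_ptes_none,
			   bool account_swap)
{
	unsigned int nr_ptes = 1u << order;
	unsigned int nr_occupied = nr_present;

	/* A range cannot hold more entries than it has PTE slots. */
	assert(nr_present + unmapped <= nr_ptes);

	/* Only the PMD-order candidate may count swap PTEs as occupied. */
	if (account_swap && order == MODEL_PMD_ORDER)
		nr_occupied += unmapped;

	/* Written as a sum so a large max_ptes_none cannot underflow. */
	return nr_occupied + max_ptes_none >= nr_ptes;
}
```

With max_ptes_none = 0, 511 present PTEs and one swap PTE,
range_eligible(9, 511, 1, 0, false) rejects the PMD-order candidate, while
range_eligible(9, 511, 1, 0, true) accepts it. A lower-order range such as
range_eligible(4, 15, 1, 0, true) is still rejected, matching the intent to
keep non-present PTEs out of the lower-order check.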