Re: [PATCH v2 4/4] mm: khugepaged: set to next mm direct when mm has MMF_DISABLE_THP_COMPLETELY
From: Wei Yang
Date: Wed Dec 31 2025 - 21:04:57 EST
On Wed, Dec 31, 2025 at 01:21:12PM +0100, David Hildenbrand (Red Hat) wrote:
>On 12/31/25 03:51, Wei Yang wrote:
>> On Tue, Dec 30, 2025 at 09:03:23PM +0100, David Hildenbrand (Red Hat) wrote:
>> > On 12/29/25 06:51, Vernon Yang wrote:
>> > > When an mm with the MMF_DISABLE_THP_COMPLETELY flag is detected during
>> > > scanning, directly set khugepaged_scan.mm_slot to the next mm_slot,
>> > > reduce redundant operation.
>> > >
>> > > Signed-off-by: Vernon Yang <yanglincheng@xxxxxxxxxx>
>> > > ---
>> > > mm/khugepaged.c | 9 +++++++--
>> > > 1 file changed, 7 insertions(+), 2 deletions(-)
>> > >
>> > > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>> > > index 2b3685b195f5..72be87ef384b 100644
>> > > --- a/mm/khugepaged.c
>> > > +++ b/mm/khugepaged.c
>> > > @@ -2439,6 +2439,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
>> > > cond_resched();
>> > > if (unlikely(hpage_collapse_test_exit_or_disable(mm))) {
>> > > + vma = NULL;
>> > > progress++;
>> > > break;
>> > > }
>> >
>> > I don't understand why we need changes at all.
>> >
>> > The code is
>> >
>> > mm = slot->mm;
>> > /*
>> > * Don't wait for semaphore (to avoid long wait times). Just move to
>> > * the next mm on the list.
>> > */
>> > vma = NULL;
>> > if (unlikely(!mmap_read_trylock(mm)))
>> > goto breakouterloop_mmap_lock;
>> >
>> > progress++;
>> > if (unlikely(hpage_collapse_test_exit_or_disable(mm)))
>> > goto breakouterloop;
>> >
>> > ...
>> >
>> > So we'll go straight to breakouterloop with vma=NULL.
>> >
>> > Do you want to optimize for skipping the MM if the flag gets toggled
>> > while we are scanning that MM?
>> >
>> > Is that really something we should be worrying about?
>> >
>> > Also, why can't we simply do a
>> >
>> > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>> > index 97d1b2824386f..af8481d4b0f4e 100644
>> > --- a/mm/khugepaged.c
>> > +++ b/mm/khugepaged.c
>> > @@ -2516,7 +2516,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
>> > * Release the current mm_slot if this mm is about to die, or
>> > * if we scanned all vmas of this mm.
>> > */
>> > - if (hpage_collapse_test_exit(mm) || !vma) {
>> > + if (hpage_collapse_test_exit_or_disable(mm) || !vma) {
>> > /*
>> > * Make sure that if mm_users is reaching zero while
>> > * khugepaged runs here, khugepaged_exit will find
>> >
>>
>> This one looks better.
>>
>> But the sad thing is we can't remove this mm from scan list, since user may
>> toggle this flag later.
>
>In theory we could readd it to the list once the flag gets toggled.
>
Currently khugepaged_enter_vma() adds an mm to the scan list based on a vma
property, while the MMF_DISABLE_THP_COMPLETELY flag is toggled per mm. If we
want to re-add the mm in prctl_set_thp_disable(), would we need to change the
semantics of khugepaged_enter_vma() or introduce another interface?

>In fact, we could remove it from the list once we set the flag. But not sure
>if that ends up any cleaner (dealing with races? not sure).
>
Removal is clear to me; my concern is how we add it back, which still looks
unclear to me as described above.

Another question is how many "thp disabled" processes we would have in a
system. If there are not that many, checking the flag and skipping the mm
during the scan looks like enough.
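
To make "check the flag and skip" concrete, here is a rough userspace sketch.
It is not kernel code: struct toy_mm, its thp_disabled field (standing in for
MMF_DISABLE_THP_COMPLETELY) and toy_scan_pass() are made-up names used only
for illustration. The point is that a disabled entry stays on the list, so no
re-add path is needed when the flag is cleared later.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an mm sitting on the khugepaged scan list. */
struct toy_mm {
	bool thp_disabled;	/* models MMF_DISABLE_THP_COMPLETELY */
	unsigned int scanned;	/* how many passes actually scanned this entry */
};

/*
 * Walk the whole list once. Entries with thp_disabled set are skipped but
 * are never removed, so clearing the flag makes them eligible again on the
 * next pass. Returns the number of entries scanned in this pass.
 */
static size_t toy_scan_pass(struct toy_mm *mms, size_t n)
{
	size_t done = 0;

	assert(mms != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (mms[i].thp_disabled)
			continue;	/* skip, but keep it on the list */
		mms[i].scanned++;
		done++;
	}
	return done;
}
```

The cost of this approach is one flag test per disabled entry per pass, which
is why the number of such processes matters.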

BTW, would it be even more beneficial if we could skip all THP-mapped
processes during the scan?

>--
>Cheers
>
>David
--
Wei Yang
Help you, Help me