Re: [PATCH mm-unstable v4 5/5] mm/khugepaged: unify khugepaged and madv_collapse with collapse_single_pmd()

From: Lorenzo Stoakes (Oracle)

Date: Tue Mar 31 2026 - 16:08:18 EST


On Tue, Mar 31, 2026 at 10:00:58PM +0200, David Hildenbrand (Arm) wrote:
> On 3/31/26 16:15, Lorenzo Stoakes (Oracle) wrote:
> > On Tue, Mar 31, 2026 at 04:13:29PM +0200, David Hildenbrand (Arm) wrote:
> >>>
> >>> So this is the bug. 'lock_dropped' needs to record if the lock was _ever_
> >>> dropped, not if it is _currently_ dropped.
> >>
> >> Ah, well spotted. I thought we discussed that at some point during
> >> review ...
> >>
> >> Thanks for tackling this!
> >>
> >> [...]
> >>
> >>>
> >>> ----8<----
> >>> From a4dfc7718a15035449f344a0bc7f58e449366405 Mon Sep 17 00:00:00 2001
> >>> From: "Lorenzo Stoakes (Oracle)" <ljs@xxxxxxxxxx>
> >>> Date: Tue, 31 Mar 2026 13:11:18 +0100
> >>> Subject: [PATCH] mm/khugepaged: fix issue with tracking lock
> >>>
> >>> We are incorrectly treating lock_dropped to track both whether the lock is
> >>> currently held and whether or not the lock was ever dropped.
> >>>
> >>> Update this change to account for this.
> >>>
> >>> Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@xxxxxxxxxx>
> >>> ---
> >>> mm/khugepaged.c | 12 ++++++++----
> >>> 1 file changed, 8 insertions(+), 4 deletions(-)
> >>>
> >>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> >>> index d21348b85a59..b8452dbdb043 100644
> >>> --- a/mm/khugepaged.c
> >>> +++ b/mm/khugepaged.c
> >>> @@ -2828,6 +2828,7 @@ int madvise_collapse(struct vm_area_struct *vma, unsigned long start,
> >>> unsigned long hstart, hend, addr;
> >>> enum scan_result last_fail = SCAN_FAIL;
> >>> int thps = 0;
> >>> + bool mmap_unlocked = false;
> >>>
> >>> BUG_ON(vma->vm_start > start);
> >>> BUG_ON(vma->vm_end < end);
> >>> @@ -2850,10 +2851,11 @@ int madvise_collapse(struct vm_area_struct *vma, unsigned long start,
> >>> for (addr = hstart; addr < hend; addr += HPAGE_PMD_SIZE) {
> >>> enum scan_result result = SCAN_FAIL;
> >>>
> >>> - if (*lock_dropped) {
> >>> + if (mmap_unlocked) {
> >>> cond_resched();
> >>> mmap_read_lock(mm);
> >>> - *lock_dropped = false;
> >>> + mmap_unlocked = false;
> >>> + *lock_dropped = true;
> >>> result = hugepage_vma_revalidate(mm, addr, false, &vma,
> >>> cc);
> >>> if (result != SCAN_SUCCEED) {
> >>> @@ -2864,7 +2866,7 @@ int madvise_collapse(struct vm_area_struct *vma, unsigned long start,
> >>
> >> (in hurry so I might be wrong)
> >>
> >> Do we have to handle when collapse_single_pmd() dropped the lock as well?
> >
> > That updates mmap_unlocked which will be handled in the loop or on exit:
> >
> > if (mmap_unlocked) {
> > ...
> > mmap_read_lock(mm);
> > mmap_unlocked = false;
> > *lock_dropped = true;
> > ...
> > }
> >
> > result = collapse_single_pmd(addr, vma, &mmap_unlocked, cc);
> >
> > ...
> >
> > out_maybelock:
> > /* Caller expects us to hold mmap_lock on return */
> > if (mmap_unlocked) {
> > *lock_dropped = true;
> > mmap_read_lock(mm);
> > }
> >
>
> Right.
>
> The original code used
>
> bool mmap_locked = true;
>
> If the fix get squashed, maybe we should revert to that handling to
> minimize the churn?

Well collapse_single_pmd() and all descendents would need to be updated to
invert the boolean, not sure it's worth it.

>
> (this is not in stable, right?)
>
> --
> Cheers,
>
> David