Re: [RFC 03/11] khugepaged: Don't allocate khugepaged mm_slot early
From: Nico Pache
Date: Fri Jan 10 2025 - 14:38:37 EST
On Thu, Jan 9, 2025 at 11:11 PM Dev Jain <dev.jain@xxxxxxx> wrote:
>
>
>
> On 09/01/25 5:01 am, Nico Pache wrote:
> > We should only "enter"/allocate the khugepaged mm_slot if we succeed at
> > allocating the PMD sized folio. Move the khugepaged_enter_vma call until
> > after we know the vma_alloc_folio was successful.
>
> Why? We have the appropriate checks from thp_vma_allowable_orders() and
> friends, so the VMA should be registered with khugepaged irrespective of
> whether during fault time we are able to allocate a PMD-THP or not. If
> we fail at fault time, it is the job of khugepaged to try to collapse it
> later.
That's a fair point. This was written a while back when I first
started looking into khugepaged. I believe the current schema for
khugepaged_enter_vma is to only register when there is a mapping large
enough for khugepaged to work on. I'd like to remove this restriction
in the future to simplify the entry points of khugepaged. Currently we
need to add these khugepaged_enter_vma functions all over the place,
ideally we just register everything with khugepaged.
Either way, you are correct, even if we FALLBACK, the mapping would
still be eligible for promotion in the future.
Ill drop this patch. Thanks!
> >
> > Signed-off-by: Nico Pache <npache@xxxxxxxxxx>
> > ---
> > mm/huge_memory.c | 3 +--
> > 1 file changed, 1 insertion(+), 2 deletions(-)
> >
> > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > index e53d83b3e5cf..635c65e7ef63 100644
> > --- a/mm/huge_memory.c
> > +++ b/mm/huge_memory.c
> > @@ -1323,7 +1323,6 @@ vm_fault_t do_huge_pmd_anonymous_page(struct vm_fault *vmf)
> > ret = vmf_anon_prepare(vmf);
> > if (ret)
> > return ret;
> > - khugepaged_enter_vma(vma, vma->vm_flags);
> >
> > if (!(vmf->flags & FAULT_FLAG_WRITE) &&
> > !mm_forbids_zeropage(vma->vm_mm) &&
> > @@ -1365,7 +1364,7 @@ vm_fault_t do_huge_pmd_anonymous_page(struct vm_fault *vmf)
> > }
> > return ret;
> > }
> > -
> > + khugepaged_enter_vma(vma, vma->vm_flags);
> > return __do_huge_pmd_anonymous_page(vmf);
> > }
> >
>
> In any case, you are not achieving what you described in the patch
> description: you have moved khugepaged_enter_vma() after the read fault
> logic, what you want to do is to move it after
> vma_alloc_anon_folio_pmd() in __do_huge_pmd_anonymous_page().
Good catch! This was a byproduct of a rebase... back when i wrote this
the vma_alloc_folio was in the do_huge_pmd_anonymous_page function.
>