Re: [PATCH v3 12/12] mm/gup: Handle hugetlb in the generic follow_page_mask code

From: Peter Xu
Date: Fri Mar 22 2024 - 22:15:36 EST


On Fri, Mar 22, 2024 at 08:45:59PM -0400, Peter Xu wrote:
> On Fri, Mar 22, 2024 at 01:48:18PM -0700, Andrew Morton wrote:
> > On Thu, 21 Mar 2024 18:08:02 -0400 peterx@xxxxxxxxxx wrote:
> >
> > > From: Peter Xu <peterx@xxxxxxxxxx>
> > >
> > > Now follow_page() is ready to handle hugetlb pages in whatever form,
> > > across all architectures. Switch to the generic code path.
> > >
> > > Time to retire hugetlb_follow_page_mask(), following the previous
> > > retirement of follow_hugetlb_page() in 4849807114b8.
> > >
> > > There may be a slight difference in how the loop runs when slow GUP
> > > processes a large hugetlb range on archs supporting cont_pte/cont_pmd:
> > > with the patch applied, each iteration of __get_user_pages() resolves
> > > one pgtable entry, rather than stepping by the size of the hugetlb
> > > hstate, which may cover multiple entries in one iteration.
> > >
> > > A quick performance test in an aarch64 VM on an M1 chip shows a 15%
> > > degradation in a tight loop of slow gup after the path switch. That
> > > shouldn't be a problem, because slow gup should not be a hot path for
> > > GUP in general: when the page is commonly present, fast gup will
> > > already succeed, while when the page is indeed missing and requires a
> > > follow-up page fault, the slow-gup degradation will probably be buried
> > > in the fault path anyway. It also explains why slow gup for THP used
> > > to be very slow before 57edfcfd3419 ("mm/gup: accelerate thp gup even
> > > for "pages != NULL"") landed; the latter is not part of a performance
> > > analysis, just a side benefit. If performance becomes a concern, we
> > > can consider handling CONT_PTE in follow_page().
> > >
> > > Until that is justified as necessary, keep everything clean and simple.
> > >
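[ To illustrate the loop difference described in the commit message, here
  is a stand-alone userspace sketch that simulates __get_user_pages()'s
  "page_increm" stepping arithmetic; gup_iterations() is a hypothetical
  helper, and the 16-page case assumes a 64K cont_pte hugetlb page on
  arm64 with 4K base pages. ]

#include <stdio.h>

#define PAGE_SHIFT	12
#define PAGE_SIZE	(1UL << PAGE_SHIFT)

/* Count the loop iterations needed to cover @npages starting at @addr,
 * when each follow_page_mask() call reports coverage via @page_mask. */
static unsigned long gup_iterations(unsigned long addr, unsigned long npages,
				    unsigned long page_mask)
{
	unsigned long iters = 0;

	while (npages) {
		/* same "page_increm" arithmetic as __get_user_pages() */
		unsigned long incr = 1 + (~(addr >> PAGE_SHIFT) & page_mask);

		if (incr > npages)
			incr = npages;
		addr += incr * PAGE_SIZE;
		npages -= incr;
		iters++;
	}
	return iters;
}

int main(void)
{
	/* old behavior: page_mask derived from the 64K hstate (mask 15) */
	printf("hstate-sized steps: %lu\n", gup_iterations(0, 16, 15));	/* 1 */
	/* new behavior: one pgtable entry (one base page) per iteration */
	printf("per-entry steps:    %lu\n", gup_iterations(0, 16, 0));	/* 16 */
	return 0;
}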
> >
> > mm/gup.c:33:21: warning: 'follow_hugepd' declared 'static' but never defined [-Wunused-function]
> >    33 | static struct page *follow_hugepd(struct vm_area_struct *vma, hugepd_t hugepd,
> >       |                     ^~~~~~~~~~~~~
> >
> > --- a/mm/gup.c~mm-gup-handle-hugepd-for-follow_page-fix
> > +++ a/mm/gup.c
> > @@ -30,10 +30,12 @@ struct follow_page_context {
> >  	unsigned int page_mask;
> >  };
> >  
> > +#ifdef CONFIG_HAVE_FAST_GUP
> >  static struct page *follow_hugepd(struct vm_area_struct *vma, hugepd_t hugepd,
> >  				   unsigned long addr, unsigned int pdshift,
> >  				   unsigned int flags,
> >  				   struct follow_page_context *ctx);
> > +#endif
> >  
> >  static inline void sanity_check_pinned_pages(struct page **pages,
> >  					     unsigned long npages)
> > _
> >
> >
> > This looks inelegant.
> >
> > That's two build issues so far. Please be more expansive in the
> > Kconfig variations when testing. Especially when mucking with pgtable
> > macros.
>
> Andrew,
>
> Apologies for that, and also for the slightly late response. Yeah, it's
> time I put together my own setup for serious build testing, and I'll at
> least start covering a few error-prone configs/archs with it.
>
> I had been relying on the build bot in many previous cases like this, as
> it was quite useful for covering build issues without my investing in a
> test setup of my own, but for some reason it seems to have retired and
> stopped working a while ago. Maybe I shouldn't have relied on it at all.
>
> For this specific issue, I'm not sure CONFIG_HAVE_FAST_GUP is the proper
> guard, as follow_hugepd() is used in slow gup, not fast gup. So maybe we
> can put it under CONFIG_MMU below that code (and I think we can drop
> "static" too, as I don't think it's anything useful). My version of the
> fixup is attached at the end of this email, and I verified it on an m68k
> build.

Correction: the "static" is useful after all; the patch below passes on
m68k but won't on x86. Please ignore that suggestion.
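[ For the record, my understanding of why dropping "static" breaks the
  x86 build (a hypothetical stand-alone reduction with a simplified
  signature, not the kernel code): once the forward declaration loses
  "static", it conflicts with the later static definition in gup.c, which
  gcc rejects outright; on m68k that definition presumably isn't compiled
  in, so nothing clashes. This snippet intentionally does not compile: ]

struct page;

/* non-static forward declaration, as in the patch below */
struct page *follow_hugepd(void);

/*
 * later static definition; gcc:
 *   error: static declaration of 'follow_hugepd' follows
 *          non-static declaration
 */
static struct page *follow_hugepd(void)
{
	return 0;
}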
>
> I do plan to post a small fixup series to address these issues (so far
> it may contain 1 formal patch to touch up vmstat_item_print_in_thp, and
> 2 fixups whose subjects I'll mark with "fixup!" properly). Either you
> can pick up the fixup below, or you can wait for my small patchset; it
> should be out either today or tomorrow.

I changed plans here too; I found more users of HPAGE_PMD_NR that assume
it's defined even with !CONFIG_MMU. That's weird, as !CONFIG_MMU doesn't
even define PMD_SHIFT... To fix this, I decided to use the old trick of
defining it with BUILD_BUG(), the way it used to work before; frankly, I
don't know how that never threw warnings, but I'll make sure it passes
all known builds (ps: I still haven't got my build harness ready, so
coverage will still be limited, but it should solve the known issues).
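[ A stand-alone sketch of the BUILD_BUG() trick, and my guess at why it
  never warned: the call only breaks the build if it survives dead-code
  elimination, so !CONFIG_MMU users sitting in always-false branches
  compile away quietly. This is a userspace approximation (gcc, built
  with -O1 or higher); the real machinery lives in
  include/linux/build_bug.h, and HPAGE_PMD_NR's definition is simplified
  here. ]

/* userspace stand-in for the kernel's BUILD_BUG() */
extern void __attribute__((error("BUILD_BUG failed"))) __build_bug(void);
#define BUILD_BUG()	__build_bug()

/* old-style stub: expands anywhere, fatal only if actually evaluated */
#define HPAGE_PMD_SHIFT	({ BUILD_BUG(); 0; })
#define HPAGE_PMD_NR	(1UL << (HPAGE_PMD_SHIFT - 12))

int main(void)
{
	unsigned long nr = 0;

	if (0)			/* dead branch: optimized out, no error */
		nr = HPAGE_PMD_NR;
	/* nr = HPAGE_PMD_NR;	   a live use here would fail the build */
	return (int)nr;
}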

In short: please wait for my fixup series. Thanks.

>
> Thanks,
>
> ===8<===
> diff --git a/mm/gup.c b/mm/gup.c
> index 4cd349390477..a2ed8203495a 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -30,11 +30,6 @@ struct follow_page_context {
>  	unsigned int page_mask;
>  };
>  
> -static struct page *follow_hugepd(struct vm_area_struct *vma, hugepd_t hugepd,
> -				  unsigned long addr, unsigned int pdshift,
> -				  unsigned int flags,
> -				  struct follow_page_context *ctx);
> -
>  static inline void sanity_check_pinned_pages(struct page **pages,
>  					     unsigned long npages)
>  {
> @@ -505,6 +500,12 @@ static inline void mm_set_has_pinned_flag(unsigned long *mm_flags)
>  }
>  
>  #ifdef CONFIG_MMU
> +
> +struct page *follow_hugepd(struct vm_area_struct *vma, hugepd_t hugepd,
> +			   unsigned long addr, unsigned int pdshift,
> +			   unsigned int flags,
> +			   struct follow_page_context *ctx);
> +
>  static struct page *no_page_table(struct vm_area_struct *vma,
>  				  unsigned int flags, unsigned long address)
>  {
> ===8<===
>
> --
> Peter Xu

--
Peter Xu