Re: [PATCH v4 bpf 0/4] vmalloc: bpf: introduce VM_ALLOW_HUGE_VMAP
From: Luis Chamberlain
Date: Fri Apr 15 2022 - 22:20:41 EST
On Fri, Apr 15, 2022 at 06:34:16PM -0700, Song Liu wrote:
> On Fri, Apr 15, 2022 at 12:05 PM Luis Chamberlain <mcgrof@xxxxxxxxxx> wrote:
> >
> > On Fri, Apr 15, 2022 at 09:44:09AM -0700, Song Liu wrote:
> > > Changes v3 => v4:
> > > 1. Fix __weak module_alloc_huge; remove unused vmalloc_huge; rename
> > > __vmalloc_huge => vmalloc_huge. (Christoph Hellwig)
> > > 2. Use vzalloc (as it was before vmalloc_no_huge) and clean up comments in
> > > kvm_s390_pv_alloc_vm.
> > >
> > > Changes v2 => v3:
> > > 1. Use __vmalloc_huge in alloc_large_system_hash.
> > > 2. Use EXPORT_SYMBOL_GPL for new functions. (Christoph Hellwig)
> > > 3. Add more description about the issues and changes.(Christoph Hellwig,
> > > Rick Edgecombe).
> > >
> > > Changes v1 => v2:
> > > 1. Add vmalloc_huge(). (Christoph Hellwig)
> > > 2. Add module_alloc_huge(). (Christoph Hellwig)
> > > 3. Add Fixes tag and Link tag. (Thorsten Leemhuis)
> > >
> > > Enabling HAVE_ARCH_HUGE_VMALLOC on x86_64 and use it for bpf_prog_pack has
> > > caused some issues [1], as many users of vmalloc are not yet ready to
> > > handle huge pages. To enable a more smooth transition to use huge page
> > > backed vmalloc memory, this set replaces VM_NO_HUGE_VMAP flag with an new
> > > opt-in flag, VM_ALLOW_HUGE_VMAP. More discussions about this topic can be
> > > found at [2].
> > >
> > > Patch 1 removes VM_NO_HUGE_VMAP and adds VM_ALLOW_HUGE_VMAP.
> > > Patch 2 uses VM_ALLOW_HUGE_VMAP in bpf_prog_pack.
> > >
> > > [1] https://lore.kernel.org/lkml/20220204185742.271030-1-song@xxxxxxxxxx/
> > > [2] https://lore.kernel.org/linux-mm/20220330225642.1163897-1-song@xxxxxxxxxx/
> >
> > Looks good except for that I think this should just wait for v5.19. The
> > fixes are so large I can't see why this needs to be rushed in other than
> > the first assumptions of the optimizations had some flaws addressed here.
>
> We need these changes to fix issues like [3]. Note that there might
> still be some
> undiscovered issues with huge page backed vmalloc memory on powerpc, which
> had HAVE_ARCH_HUGE_VMALLOC enabled since the 5.15 kernel. As we
> agreed, the new opt-in flag is a safer approach here. We probably should have
> 1/4 and 2/4 back ported to stable. Therefore, I think shipping this
> set now would
> give us a more reliable 5.18 release.
>
> Does this make sense?
Yes absolutely, but that sounds like that optimization should just be
reverted completely from v5.18 isntead then no? Or if one can skip the
optimizations for v5.18 with a small patch, and then enable for v5.19.
Luis