Re: [RFC PATCH v2] mm/vmalloc: fix incorrect __vmap_pages_range_noflush() if vm_area_alloc_pages() from high order fallback to order0
From: Barry Song
Date: Thu Jul 25 2024 - 05:35:16 EST
On Thu, Jul 25, 2024 at 9:17 PM Hailong Liu <hailong.liu@xxxxxxxx> wrote:
>
> On Thu, 25. Jul 18:21, Barry Song wrote:
> > On Thu, Jul 25, 2024 at 3:53 PM <hailong.liu@xxxxxxxx> wrote:
> [snip]
> >
> > This is still incorrect because it undoes Michal's work. We also need to break
> > the loop if (!nofail), which you're currently omitting.
>
> IIUC, the original issue was kvcalloc with __GFP_NOFAIL returning NULL.
> https://lore.kernel.org/all/ZAXynvdNqcI0f6Us@xxxxxxxxxxxxxx/T/#u
> If we disable the huge flag in kmalloc_node, the issue will be fixed.

No, this just bypasses kvmalloc and doesn't solve the underlying issue. The
problem can still be triggered through vmalloc_huge() even after the bypass.
Once we reorganize vmap_huge to support a combination of PMD and PTE
mappings, we should re-enable HUGE_VMAP for kvmalloc.

I would consider dropping VM_ALLOW_HUGE_VMAP for kvmalloc a short-term
"optimization" to save memory rather than a long-term fix. This
"optimization" is only valid until we reorganize HUGE_VMAP in a way
similar to THP. I mean, for a 2.1MB kvmalloc, we could map 2MB with a PMD
and the remaining 0.1MB with PTEs.

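To make the split concrete, here is a rough userspace sketch of the
arithmetic only. It assumes 4KB base pages and 2MB PMD mappings; the
SKETCH_* constants, struct sketch_split and sketch_split_size() are
hypothetical names for illustration and are not kernel code.

```c
/* Assumed sizes for illustration: 4KB base pages, 2MB PMD mappings. */
#define SKETCH_PAGE_SIZE (4096UL)
#define SKETCH_PMD_SIZE  (2UL * 1024 * 1024)

struct sketch_split {
	unsigned long pmd_bytes; /* leading part that PMD entries could cover */
	unsigned long pte_bytes; /* page-aligned tail left for PTE entries */
};

/* Hypothetical helper: split a request into a PMD part and a PTE tail. */
static struct sketch_split sketch_split_size(unsigned long size)
{
	struct sketch_split s;
	/* Round the request up to whole base pages first. */
	unsigned long aligned = (size + SKETCH_PAGE_SIZE - 1) &
				~(SKETCH_PAGE_SIZE - 1);

	/* Everything up to the last full 2MB boundary goes to PMDs. */
	s.pmd_bytes = aligned & ~(SKETCH_PMD_SIZE - 1);
	/* The remainder is smaller than one PMD and is mapped with PTEs. */
	s.pte_bytes = aligned - s.pmd_bytes;
	return s;
}
```

With these assumptions, a request of 2MB + 100KB would end up as one 2MB
PMD mapping plus 25 base pages, instead of rounding the whole allocation
up to 4MB.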
> >
> > To avoid reverting Michal's work, the simplest "fix" would be,
> >
> > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > index caf032f0bd69..0011ca30df1c 100644
> > --- a/mm/vmalloc.c
> > +++ b/mm/vmalloc.c
> > @@ -3775,7 +3775,7 @@ void *__vmalloc_node_range_noprof(unsigned long size, unsigned long align,
> > return NULL;
> > }
> >
> > - if (vmap_allow_huge && (vm_flags & VM_ALLOW_HUGE_VMAP)) {
> > + if (vmap_allow_huge && (vm_flags & VM_ALLOW_HUGE_VMAP) && !(gfp_mask & __GFP_NOFAIL)) {
> > unsigned long size_per_node;
> >
> > /*
> > >
> > > [1] https://lore.kernel.org/lkml/20240724182827.nlgdckimtg2gwns5@xxxxxxxx/
> > > 2.34.1
> >
> > Thanks
> > Barry
>
> --
> help you, help me,
> Hailong.