Re: [PATCH v2 15/16] mm/slab: remove __GFP_NO_OBJ_EXT usage from alloc_slab_obj_exts()
From: Hao Li
Date: Fri Jun 12 2026 - 07:30:32 EST
On Fri, Jun 12, 2026 at 12:17:45PM +0200, Vlastimil Babka (SUSE) wrote:
> On 6/12/26 08:54, Hao Li wrote:
> > On Wed, Jun 10, 2026 at 05:40:17PM +0200, Vlastimil Babka (SUSE) wrote:
> >> __GFP_NO_OBJ_EXT has limited scope within the slab allocator itself and
> >> gfp flags are a scarce resource, unlike slab's alloc_flags.
> >>
> >> Introduce SLAB_ALLOC_NO_RECURSE alloc flag that has the same intent as
> >> __GFP_NO_OBJ_EXT but a more generic name, meaning that a kmalloc()
> >> family function should not recurse into another kmalloc*() for the
> >> purposes of allocating auxiliary structures (obj_ext arrays or sheaves).
> >>
> >> First, replace the __GFP_NO_OBJ_EXT for allocating obj_ext arrays in
> >> alloc_slab_obj_exts(). Make use of the newly added kmalloc_flags()
> >> function, where we can pass alloc_flags with SLAB_ALLOC_NO_RECURSE
> >> added. This will also pass through SLAB_ALLOC_TRYLOCK so we don't need
> >> to special case kmalloc_nolock() anymore.
> >>
> >> Note that until now the kmalloc_nolock() ignored the incoming gfp flags
> >> and hardcoded __GFP_ZERO | __GFP_NO_OBJ_EXT. But it's correct to pass on
> >> the incoming gfp flags (only augmented with __GFP_ZERO), because if
> >> alloc_flags contain SLAB_ALLOC_TRYLOCK, the incoming gfp flags have to
> >> be also compatible with it.
> >>
> >> Signed-off-by: Vlastimil Babka (SUSE) <vbabka@xxxxxxxxxx>
> >> ---
> >> mm/slab.h | 1 +
> >> mm/slub.c | 13 +++++--------
> >> 2 files changed, 6 insertions(+), 8 deletions(-)
> >>
> >> diff --git a/mm/slab.h b/mm/slab.h
> >> index 45bfcfb35a9c..509f330654b8 100644
> >> --- a/mm/slab.h
> >> +++ b/mm/slab.h
> >> @@ -21,6 +21,7 @@
> >> #define SLAB_ALLOC_DEFAULT 0x00 /* no flags */
> >> #define SLAB_ALLOC_TRYLOCK 0x01 /* a kmalloc_nolock() allocation */
> >> #define SLAB_ALLOC_NEW_SLAB 0x02 /* a flag for alloc_slab_obj_exts() */
> >> +#define SLAB_ALLOC_NO_RECURSE 0x04 /* prevent kmalloc() recursion */
> >>
> >> static inline bool alloc_flags_allow_spinning(const unsigned int alloc_flags)
> >> {
> >> diff --git a/mm/slub.c b/mm/slub.c
> >> index cbb38bd01e46..7dfbd0251aa2 100644
> >> --- a/mm/slub.c
> >> +++ b/mm/slub.c
> >> @@ -2167,15 +2167,12 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
> >>
> >> gfp &= ~OBJCGS_CLEAR_MASK;
> >> /* Prevent recursive extension vector allocation */
> >> - gfp |= __GFP_NO_OBJ_EXT;
> >> + alloc_flags |= SLAB_ALLOC_NO_RECURSE;
> >>
> >> sz = obj_exts_alloc_size(s, slab, gfp);
> >>
> >
> > For the original calls to kmalloc_nolock and kmalloc_node, I notice a difference:
> >
> >> - if (unlikely(!allow_spin))
> >> - vec = kmalloc_nolock(sz, __GFP_ZERO | __GFP_NO_OBJ_EXT,
> >> - slab_nid(slab));
> >
> > kmalloc_nolock completely discarded `gfp` flags.
> >
> >> - else
> >> - vec = kmalloc_node(sz, gfp | __GFP_ZERO, slab_nid(slab));
> >
> > while kmalloc_node preserved and passed it along.
> >
> >> + /* This will use kmalloc_nolock() if alloc_flags say so */
> >> + vec = kmalloc_flags(sz, gfp | __GFP_ZERO, alloc_flags, slab_nid(slab));
> >
> > Now that both paths are merged into kmalloc_flags, the gfp flags are
> > unconditionally carried through. This might pass along some unwanted flags.
> >
> > I traced the call path and found that ___slab_alloc sets __GFP_THISNODE
> > in trynode_flags. If this flag propagates all the way into
> > kmalloc_flags->...->__kmalloc_nolock_noprof, it will trigger the
> > VM_WARN_ON_ONCE warning. Maybe we need to strip the original gfp flags if
> > `!allow_spin`.
>
> Thanks. This should do the job in a more generic way I hope?
>
Yeah, this is more elegant.
> diff --git a/mm/slub.c b/mm/slub.c
> index f9b8dc56bb57..0bf53f70c9be 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -2047,12 +2047,15 @@ static inline void dec_slabs_node(struct kmem_cache *s, int node,
> #endif /* CONFIG_SLUB_DEBUG */
>
> /*
> - * The allocated objcg pointers array is not accounted directly.
> + * The allocated objcg pointers array or sheaf is not accounted directly.
> * Moreover, it should not come from DMA buffer and is not readily
> - * reclaimable. So those GFP bits should be masked off.
> + * reclaimable. Node restriction for the parent allocation also should
> + * not apply to the slab's internal objects.
> + * So those GFP bits should be masked off.
> */
> #define OBJCGS_CLEAR_MASK (__GFP_DMA | __GFP_RECLAIMABLE | \
> - __GFP_ACCOUNT | __GFP_NOFAIL)
> + __GFP_ACCOUNT | __GFP_NOFAIL | \
> + __GFP_THISNODE)
Good idea! Both code and comments make sense to me.
>
> #ifdef CONFIG_SLAB_OBJ_EXT
>
>
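For reference, here is a minimal userspace sketch of what the extended mask does to the flags before they reach the nolock path. The DEMO_* names and bit values are hypothetical and are not the kernel's real __GFP_* values. demo_nolock_would_warn() is only a simplified stand-in for the VM_WARN_ON_ONCE check, modelling the one case discussed above (__GFP_THISNODE leaking through).

```c
#include <stdbool.h>

/* Hypothetical bit values for illustration only; the kernel's real
 * __GFP_* values differ. */
#define DEMO_GFP_DMA         0x01u
#define DEMO_GFP_RECLAIMABLE 0x02u
#define DEMO_GFP_ACCOUNT     0x04u
#define DEMO_GFP_NOFAIL      0x08u
#define DEMO_GFP_THISNODE    0x10u
#define DEMO_GFP_ZERO        0x20u

/* Stand-in for OBJCGS_CLEAR_MASK with __GFP_THISNODE added. */
#define DEMO_CLEAR_MASK (DEMO_GFP_DMA | DEMO_GFP_RECLAIMABLE | \
			 DEMO_GFP_ACCOUNT | DEMO_GFP_NOFAIL | \
			 DEMO_GFP_THISNODE)

/* Models the masking step in alloc_slab_obj_exts(): drop the bits that
 * only make sense for the parent allocation, then ask for zeroed memory. */
static unsigned int demo_obj_exts_gfp(unsigned int parent_gfp)
{
	return (parent_gfp & ~DEMO_CLEAR_MASK) | DEMO_GFP_ZERO;
}

/* Simplified assumption: the nolock path warns if the node restriction
 * is still set. The real check may cover more flags. */
static bool demo_nolock_would_warn(unsigned int gfp)
{
	return (gfp & DEMO_GFP_THISNODE) != 0;
}
```

With this model, a parent gfp that includes DEMO_GFP_THISNODE would trip the warning if passed through unchanged, but not after demo_obj_exts_gfp() has masked it. That matches the intent of adding __GFP_THISNODE to the clear mask instead of special-casing `!allow_spin`.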
--
Thanks,
Hao