Re: [PATCH RFC 10/19] slab: remove cpu (partial) slabs usage from allocation paths
From: Harry Yoo
Date: Thu Oct 30 2025 - 00:33:32 EST
On Thu, Oct 23, 2025 at 03:52:32PM +0200, Vlastimil Babka wrote:
> We now rely on sheaves as the percpu caching layer and can refill them
> directly from partial or newly allocated slabs. Start removing the cpu
> (partial) slabs code, first from allocation paths.
>
> This means that any allocation not satisfied from percpu sheaves will
> end up in ___slab_alloc(), where we remove the usage of cpu (partial)
> slabs, so it will only perform get_partial() or new_slab().
>
> In get_partial_node() we used to return a slab to be frozen as the cpu
> slab, or to refill the cpu partial list. Now we only want to return a
> single object and leave the slab on the list (unless it became full).
> We can't
> simply reuse alloc_single_from_partial() as that assumes freeing uses
> free_to_partial_list(). Instead we need to use __slab_update_freelist()
> to work properly against a racing __slab_free().
>
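The paragraph above is about taking a single object off a partial slab's
freelist while a concurrent __slab_free() may push objects back. A minimal
userspace sketch of that pattern with C11 atomics (illustrative names only,
not the actual mm/slub.c code, which I believe also updates the counters
together with the freelist in a single cmpxchg):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Toy model, not kernel code: a slab freelist from which the allocator
 * takes one object while a racing free may change the list head. */
struct toy_object { struct toy_object *next; };

struct toy_slab {
	_Atomic(struct toy_object *) freelist;
	_Atomic int inuse;
};

/* Take a single object and leave the rest of the list on the slab.
 * The compare_exchange retries whenever a concurrent free has changed
 * the head, which is why a plain load/store pair would be unsafe. */
static struct toy_object *take_one(struct toy_slab *slab)
{
	struct toy_object *head, *next;

	do {
		head = atomic_load(&slab->freelist);
		if (!head)
			return NULL;
		next = head->next;
	} while (!atomic_compare_exchange_weak(&slab->freelist, &head, next));

	atomic_fetch_add(&slab->inuse, 1);
	return head;
}
```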
> The rest of the changes consists of removing functions that no longer
> have any callers.
>
> Signed-off-by: Vlastimil Babka <vbabka@xxxxxxx>
> ---
> mm/slub.c | 614 ++++++++------------------------------------------------------
> 1 file changed, 71 insertions(+), 543 deletions(-)
> diff --git a/mm/slub.c b/mm/slub.c
> index e2b052657d11..bd67336e7c1f 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -4790,66 +4509,15 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
>
> stat(s, ALLOC_SLAB);
>
> - if (IS_ENABLED(CONFIG_SLUB_TINY) || kmem_cache_debug(s)) {
> - freelist = alloc_single_from_new_slab(s, slab, orig_size, gfpflags);
> -
> - if (unlikely(!freelist))
> - goto new_objects;
> -
> - if (s->flags & SLAB_STORE_USER)
> - set_track(s, freelist, TRACK_ALLOC, addr,
> - gfpflags & ~(__GFP_DIRECT_RECLAIM));
> -
> - return freelist;
> - }
> -
> - /*
> - * No other reference to the slab yet so we can
> - * muck around with it freely without cmpxchg
> - */
> - freelist = slab->freelist;
> - slab->freelist = NULL;
> - slab->inuse = slab->objects;
> - slab->frozen = 1;
> -
> - inc_slabs_node(s, slab_nid(slab), slab->objects);
> + freelist = alloc_single_from_new_slab(s, slab, orig_size, gfpflags);
>
> - if (unlikely(!pfmemalloc_match(slab, gfpflags) && allow_spin)) {
> - /*
> - * For !pfmemalloc_match() case we don't load freelist so that
> - * we don't make further mismatched allocations easier.
> - */
> - deactivate_slab(s, slab, get_freepointer(s, freelist));
> - return freelist;
> - }
> + if (unlikely(!freelist))
> + goto new_objects;
Could we end up in an endless loop in the !allow_spin case?
(e.g., kmalloc_nolock() is called in NMI context while n->list_lock is
held in process context on the same CPU)

We allocate a new slab, but somebody is holding n->list_lock so the
trylock fails; we free the slab, goto new_objects, and repeat.
--
Cheers,
Harry / Hyeonggon