Re: [PATCH v2] mm/slub: skip freelist construction for whole-slab bulk refill
From: hu.shengming
Date: Thu Apr 02 2026 - 03:08:52 EST
Harry wrote:
> On Wed, Apr 01, 2026 at 02:55:23PM +0800, Hao Li wrote:
> > On Wed, Apr 01, 2026 at 12:57:25PM +0800, hu.shengming@xxxxxxxxxx wrote:
> > > @@ -4395,6 +4458,48 @@ static unsigned int alloc_from_new_slab(struct kmem_cache *s, struct slab *slab,
> > > return allocated;
> > > }
> > >
> > > +static unsigned int alloc_whole_from_new_slab(struct kmem_cache *s,
> > > + struct slab *slab, void **p, bool allow_spin)
> > > +{
> > > +
> > > + unsigned int allocated = 0;
> > > + void *object, *start;
> > > +
> > > + if (alloc_whole_from_new_slab_random(s, slab, p, allow_spin,
> > > + &allocated)) {
> > > + goto done;
> > > + }
> > > +
> > > + start = fixup_red_left(s, slab_address(slab));
> > > + object = setup_object(s, start);
> > > +
> > > + while (allocated < slab->objects - 1) {
> > > + p[allocated] = object;
> > > + maybe_wipe_obj_freeptr(s, object);
> > > +
> > > + allocated++;
> > > + object += s->size;
> > > + object = setup_object(s, object);
> > > + }
> >
> > Also, I feel the current patch contains some duplicated code like this loop.
> >
> > Would it make sense to split allocate_slab() into two functions?
> >
> > For example,
> > the first part could be called allocate_slab_meta_setup() (just an example name)
> > And, the second part could be allocate_slab_objects_setup(), with the core logic
> > being the loop over objects. Then allocate_slab_objects_setup() could support
> > two modes: one called BUILD_FREELIST, which builds the freelist, and another
> > called EMIT_OBJECTS, which skips building the freelist and directly places the
> > objects into the target array.
>
> Something similar but a little bit more thoughts to unify the code
> (**regardless of CONFIG_SLAB_FREELIST_RANDOM**) and avoid treating
> "the whole slab->freelist fits into the sheaf" as a special case:
>
> - allocate_slab() no longer builds the freelist.
> the freelist is built only when there are objects left after
> allocating objects from the new slab.
>
> - new_slab() allocates a new slab AND builds the freelist
> to keep existing behaviour.
>
> - refill_objects() allocates a slab using allocate_slab(),
> and passes it to alloc_from_new_slab().
>
> alloc_from_new_slab() consumes some objects in random order,
> and then builds the freelist with the objects left (if any).
>
> We could actually abstract "iterating free objects in random order"
> into an API, and there would be two users of the API:
> - Building freelist
> - Filling objects into the sheaf (without building freelist!)
>
> Something like this...
> (names here are just examples, I'm not good at naming things!)
>
> struct freelist_iter {
> int pos;
> int freelist_count;
> int page_limit;
> void *start;
> };
>
> /* note: handling !allow_spin nicely is tricky :-) */
> alloc_from_new_slab(...) {
> struct freelist_iter fit;
>
> prep_freelist_iter(s, slab, &fit, allow_spin);
> while (slab->inuse < min(count, slab->objects)) {
> p[slab->inuse++] = next_freelist_entry(s, &fit);
> }
>
> if (slab->inuse < slab->objects)
> build_freelist(s, slab, &fit);
> }
>
> build_freelist(s, slab, fit) {
> size = slab->objects - slab->inuse;
>
> cur = next_freelist_entry(s, fit);
> cur = setup_object(s, cur);
> slab->freelist = cur;
> for (i = 1; i < size; i++) {
> next = next_freelist_entry(s, fit);
> next = setup_object(s, next);
> set_freepointer(s, cur, next);
> cur = next;
> }
> }
>
> #ifdef CONFIG_SLAB_FREELIST_RANDOM
> prep_freelist_iter(s, slab, fit, allow_spin) {
> fit->freelist_count = oo_objects(s->oo);
> fit->page_limit = slab->objects * s->size;
> fit->start = fixup_red_left(s, slab_address(slab));
>
> if (slab->objects < 2 || !s->random_seq) {
> fit->pos = 0;
> } else if (allow_spin) {
> fit->pos = get_random_u32_below(fit->freelist_count);
> } else {
> struct rnd_state *state;
>
> /*
> * An interrupt or NMI handler might interrupt and change
> * the state in the middle, but that's safe.
> */
> state = &get_cpu_var(slab_rnd_state);
> fit->pos = prandom_u32_state(state) % fit->freelist_count;
> put_cpu_var(slab_rnd_state);
> }
>
> return;
> }
> next_freelist_entry(s, fit) {
> /*
> * If the target page allocation failed, the number of objects on the
> * page might be smaller than the usual size defined by the cache.
> */
> do {
> idx = s->random_seq[fit->pos];
> fit->pos += 1;
> if (fit->pos >= fit->freelist_count)
> fit->pos = 0;
> } while (unlikely(idx >= fit->page_limit));
>
> return (char *)fit->start + idx;
> }
> #else
> prep_freelist_iter(s, slab, fit, allow_spin) {
> fit->pos = 0;
> return;
> }
> next_freelist_entry(s, fit) {
> void *next = fit->start + fit->pos * s->size;
>
> fit->pos++;
> return next;
> }
> #endif
>
Hi Harry,
Thanks a lot for the detailed suggestion. This is a very good direction for
restructuring refill_objects().
I agree that abstracting the free-object iteration and making the flow uniform
regardless of CONFIG_SLAB_FREELIST_RANDOM is a cleaner approach than keeping
the “whole slab fits into the sheaf” case as a special path. Your idea of letting
alloc_from_new_slab() consume objects first and only build the freelist for the
remainder makes a lot of sense, and should also help reduce the duplicated
object-setup logic.
I’ll rework the patch along these lines, incorporating your and Hao’s suggestions,
and send a v3.
Thanks again for the thoughtful review.
--
With Best Regards,
Shengming
> --
> Cheers,
> Harry / Hyeonggon