Re: [PATCH v2] mm/slub: skip freelist construction for whole-slab bulk refill

From: Hao Li

Date: Thu Apr 02 2026 - 05:06:28 EST


On Thu, Apr 02, 2026 at 05:12:23PM +0900, Harry Yoo (Oracle) wrote:
> On Thu, Apr 02, 2026 at 03:03:10PM +0800, hu.shengming@xxxxxxxxxx wrote:
> > Harry wrote:
> > > On Wed, Apr 01, 2026 at 02:55:23PM +0800, Hao Li wrote:
> > > > On Wed, Apr 01, 2026 at 12:57:25PM +0800, hu.shengming@xxxxxxxxxx wrote:
> > > > > @@ -4395,6 +4458,48 @@ static unsigned int alloc_from_new_slab(struct kmem_cache *s, struct slab *slab,
> > > > > return allocated;
> > > > > }
> > > > > +static unsigned int alloc_whole_from_new_slab(struct kmem_cache *s,
> > > > > + struct slab *slab, void **p, bool allow_spin)
> > > > > +{
> > > > > +
> > > > > + unsigned int allocated = 0;
> > > > > + void *object, *start;
> > > > > +
> > > > > + if (alloc_whole_from_new_slab_random(s, slab, p, allow_spin,
> > > > > + &allocated)) {
> > > > > + goto done;
> > > > > + }
> > > > > +
> > > > > + start = fixup_red_left(s, slab_address(slab));
> > > > > + object = setup_object(s, start);
> > > > > +
> > > > > + while (allocated < slab->objects - 1) {
> > > > > + p[allocated] = object;
> > > > > + maybe_wipe_obj_freeptr(s, object);
> > > > > +
> > > > > + allocated++;
> > > > > + object += s->size;
> > > > > + object = setup_object(s, object);
> > > > > + }
> > > >
> > > > Also, I feel the current patch contains some duplicated code like this loop.
> > > >
> > > > Would it make sense to split allocate_slab() into two functions?
> > > >
> > > > For example,
> > > > the first part could be called allocate_slab_meta_setup() (just an example name)
> > > > And, the second part could be allocate_slab_objects_setup(), with the core logic
> > > > being the loop over objects. Then allocate_slab_objects_setup() could support
> > > > two modes: one called BUILD_FREELIST, which builds the freelist, and another
> > > > called EMIT_OBJECTS, which skips building the freelist and directly places the
> > > > > objects into the target array.
> >
> > > Something similar but with a little more thought to unify the code
> > > (**regardless of CONFIG_SLAB_FREELIST_RANDOM**) and avoid treating
> > > "the whole slab->freelist fits into the sheaf" as a special case:
> >
> > > - allocate_slab() no longer builds the freelist.
> > >   The freelist is built only when there are objects left after
> > >   allocating objects from the new slab.
> >
> > > - new_slab() allocates a new slab AND builds the freelist
> > >   to keep existing behaviour.
> >
> > > - refill_objects() allocates a slab using allocate_slab(),
> > >   and passes it to alloc_from_new_slab().
> >
> > >   alloc_from_new_slab() consumes some objects in random order,
> > >   and then builds the freelist with the objects left (if any).
> >
> > > We could actually abstract "iterating free objects in random order"
> > > into an API, and there would be two users of the API:
> > > - Building the freelist
> > > - Filling objects into the sheaf (without building the freelist!)
> >
> > > Something like this...
> > > (names here are just examples, I'm not good at naming things!)
> >
> > > struct freelist_iter {
> > >         int pos;
> > >         int freelist_count;
> > >         int page_limit;
> > >         void *start;
> > > };
> >
> > > /* note: handling !allow_spin nicely is tricky :-) */
> > > alloc_from_new_slab(...) {
> > >         struct freelist_iter fit;
> >
> > >         prep_freelist_iter(s, slab, &fit, allow_spin);
> > >         while (slab->inuse < min(count, slab->objects)) {
> > >                 p[slab->inuse++] = next_freelist_entry(s, &fit);
> > >         }
> >
> > >         if (slab->inuse < slab->objects)
> > >                 build_freelist(s, slab, &fit);
> > > }
> >
> > > build_freelist(s, slab, fit) {
> > >         size = slab->objects - slab->inuse;
> >
> > >         cur = next_freelist_entry(s, fit);
> > >         cur = setup_object(s, cur);
> > >         slab->freelist = cur;
> > >         for (i = 1; i < size; i++) {
> > >                 next = next_freelist_entry(s, fit);
> > >                 next = setup_object(s, next);
> > >                 set_freepointer(s, cur, next);
> > >                 cur = next;
> > >         }
> > >         set_freepointer(s, cur, NULL);
> > > }
> >
> > > #ifdef CONFIG_SLAB_FREELIST_RANDOM
> > > prep_freelist_iter(s, slab, fit, allow_spin) {
> > >         fit->freelist_count = oo_objects(s->oo);
> > >         fit->page_limit = slab->objects * s->size;
> > >         fit->start = fixup_red_left(s, slab_address(slab));
> >
> > >         if (slab->objects < 2 || !s->random_seq) {
> > >                 fit->pos = 0;
> > >         } else if (allow_spin) {
> > >                 fit->pos = get_random_u32_below(fit->freelist_count);
> > >         } else {
> > >                 struct rnd_state *state;
> >
> > >                 /*
> > >                  * An interrupt or NMI handler might interrupt and change
> > >                  * the state in the middle, but that's safe.
> > >                  */
> > >                 state = &get_cpu_var(slab_rnd_state);
> > >                 fit->pos = prandom_u32_state(state) % fit->freelist_count;
> > >                 put_cpu_var(slab_rnd_state);
> > >         }
> > > }
> >
> > > next_freelist_entry(s, fit) {
> > >         /*
> > >          * If the target page allocation failed, the number of objects on the
> > >          * page might be smaller than the usual size defined by the cache.
> > >          */
> > >         do {
> > >                 idx = s->random_seq[fit->pos];
> > >                 fit->pos += 1;
> > >                 if (fit->pos >= fit->freelist_count)
> > >                         fit->pos = 0;
> > >         } while (unlikely(idx >= fit->page_limit));
> >
> > >         return fit->start + idx;
> > > }
> > > #else
> > > prep_freelist_iter(s, slab, fit, allow_spin) {
> > >         fit->pos = 0;
> > > }
> >
> > > next_freelist_entry(s, fit) {
> > >         void *next = fit->start + fit->pos * s->size;
> >
> > >         fit->pos++;
> > >         return next;
> > > }
> > > #endif
> >
> > Hi Harry,
> >
> > Thanks a lot for the detailed suggestion. This is a very good direction for
> > restructuring refill_objects().
> >
> > I agree that abstracting the free-object iteration and making the flow uniform
> > regardless of CONFIG_SLAB_FREELIST_RANDOM is a cleaner approach than keeping
> > the “whole slab fits into the sheaf” case as a special path.
> >
> > Your idea of letting alloc_from_new_slab() consume objects first and only
> > build the freelist for the remainder makes a lot of sense,
>

Hi Harry and Shengming,

I've just worked through the new code skeleton in detail, and the idea looks
really promising. It unifies the handling of the shuffle and non-shuffle cases,
and it also naturally covers the scenario where a new slab is not fully
consumed. Very nice. Let's see how v3 performs going forward.

> I believe Hao is working on trying to allow consuming all objects (from new
> and partial slabs) to fill sheaves when possible,
>

Yeah, I've had some work going on in that area too, and I'm trying to keep it
from stepping on the changes in the current patch :)

> but it'd still be nice to
> do this as long as it keeps the implementation simple.

Yes, exactly.

>
> > and should also help reduce the duplicated object-setup logic.
>
> Yeah, less code means less tokens, so better for the environm... oh wait,
> April Fools day is over! (just joking).

Oh, I didn't even realize yesterday was April Fools' Day. I guess my clock must
be broken :P

>
> > I’ll rework the patch along these lines, incorporating your and Hao's suggestions,
> > and send a v3.
>
> Thanks for working on this.
>
> > Thanks again for the thoughtful review.
>
> No problem!
>
> --
> Cheers,
> Harry / Hyeonggon

--
Thanks,
Hao