Re: [PATCH] mm/slub: batch-detach node partial slabs
From: Hao Li
Date: Wed May 27 2026 - 02:09:20 EST
On Tue, May 26, 2026 at 03:41:55PM +0200, Vlastimil Babka (SUSE) wrote:
> On 5/26/26 9:37 AM, Harry Yoo wrote:
> >
> >
> > On 5/25/26 12:22 PM, Hao Li wrote:
> >> get_partial_node_bulk() used to move each selected slab from the node
> >> partial list to the local pc->slabs list using a remove_partial() and
> >> list_add() pair. In practice, the loop often detaches several adjacent
> >> slabs, so this repeatedly manipulates list pointers while holding
> >> n->list_lock, which causes unnecessary churn.
> >>
> >> Instead, track contiguous runs of matching slabs and move each run with
> >> list_bulk_move_tail() in one operation.
> >
> > TIL list_bulk_move_tail() :D
> >
> >> This reduces list pointer churn> inside the lock critical section.
>
> Nice!
>
> > Similar to this, can we return all slabs in pc->slabs at once when
> > returning those slabs to the list? ... I see Vlastimil removed 'nr of
> > empty slabs' check in the other series already.
> >
> > Now that it inserts slabs to the tail with Vlastimil's patchset, let's
> > do a list_splice_tail() instead?
>
> Why not, although it should be rare to return more than one slab in that
> path.
Yeah, it's definitely still worth doing.
> So in case this change gets tricky for some reason, we can leave it.
sure, I will try and test.
>
> BTW, if you also get the same idea as I had and try to replace the
> current "remove from pc.slabs initially, reinsert to pc.slabs if it was
> a partial refill" with a "keep in pc.slabs and only remove if the refill
> was full" to reduce pointer churn, don't try that, it crashes rather
> quickly :D
Ah, fair enough. But honestly, it did seem like a good idea initially.
>
> >> The mmap2 testcase shows a 5% improvement after applying this patch.
> >>
> >> Signed-off-by: Hao Li <hao.li@xxxxxxxxx>
> >> ---
> >> mm/slub.c | 22 ++++++++++++++++++----
> >> 1 file changed, 18 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/mm/slub.c b/mm/slub.c
> >> index 04692a6f9128..180973a4a3d2 100644
> >> --- a/mm/slub.c
> >> +++ b/mm/slub.c
> >> @@ -3775,15 +3783,21 @@ static bool get_partial_node_bulk(struct
> >> kmem_cache *s,
> >> && total_free + slab_free > pc->max_objects)
> >> break;
> >> - remove_partial(n, slab);
> >> -
> >> - list_add(&slab->slab_list, &pc->slabs);
> >> + if (!first)
> >> + first = slab;
> >> + last = slab;
> >
> >> + slab_clear_node_partial(slab);
> >> + n->nr_partial--;
> >
> > Perhaps factor out those two statements into to a common function and
> > call it in get_partial_node_bulk() and remove_partial()?
> >> total_free += slab_free;
> >> if (total_free >= pc->max_objects)
> >> break;
> >> }
> >> + if (first)
> >> + list_bulk_move_tail(&pc->slabs, &first->slab_list,
> >> + &last->slab_list);
> >> +
> >> spin_unlock_irqrestore(&n->list_lock, flags);
> >> return total_free > 0;
> >> }
> >
>
--
Thanks,
Hao