Re: [PATCH 2/3] slab: create barns for online memoryless nodes
From: Hao Li
Date: Thu Mar 19 2026 - 03:02:03 EST
On Wed, Mar 18, 2026 at 01:11:58PM +0100, Vlastimil Babka (SUSE) wrote:
> On 3/18/26 10:27, Hao Li wrote:
> > On Wed, Mar 11, 2026 at 09:25:56AM +0100, Vlastimil Babka (SUSE) wrote:
> >> Ming Lei has reported [1] a performance regression due to replacing cpu
> >> (partial) slabs with sheaves. With slub stats enabled, a large amount of
> >> slowpath allocations were observed. The affected system has 8 online
> >> NUMA nodes but only 2 have memory.
> >>
> >> For sheaves to work effectively on a given cpu, its NUMA node has to have
> >> struct node_barn allocated. Those are currently only allocated on nodes
> >> with memory (N_MEMORY) where kmem_cache_node also exists, as the goal is
> >> to cache only node-local objects. But in order to have good performance
> >> on a memoryless node, we need its barn to exist and use sheaves to cache
> >> non-local objects (as no local objects can exist anyway).
> >>
> >> Therefore change the implementation to allocate barns on all online
> >> nodes, tracked in a new nodemask slab_barn_nodes. Also add a cpu hotplug
> >> callback as that's when a memoryless node can become online.
> >>
> >> Change rcu_sheaf->node assignment to numa_node_id() so it's returned to
> >> the barn of the local cpu's (potentially memoryless) node, and not to
> >> the nearest node with memory anymore.
> >>
> >> Reported-by: Ming Lei <ming.lei@xxxxxxxxxx>
> >> Link: https://lore.kernel.org/all/aZ0SbIqaIkwoW2mB@fedora/ [1]
> >> Signed-off-by: Vlastimil Babka (SUSE) <vbabka@xxxxxxxxxx>
> >> ---
> >> mm/slub.c | 63 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++----
> >> 1 file changed, 59 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/mm/slub.c b/mm/slub.c
> >> index 609a183f8533..d8496b37e364 100644
> >> --- a/mm/slub.c
> >> +++ b/mm/slub.c
> > [...]
> >>
> >> /*
> >> @@ -7597,7 +7648,7 @@ static int init_kmem_cache_nodes(struct kmem_cache *s)
> >> if (slab_state == DOWN || !cache_has_sheaves(s))
> >> return 1;
> >>
> >> - for_each_node_mask(node, slab_nodes) {
> >> + for_each_node_mask(node, slab_barn_nodes) {
> >> struct node_barn *barn;
> >>
> >> barn = kmalloc_node(sizeof(*barn), GFP_KERNEL, node);
> >> @@ -8250,6 +8301,7 @@ static int slab_mem_going_online_callback(int nid)
> >> * and barn initialized for the new node.
> >> */
> >> node_set(nid, slab_nodes);
> >> + node_set(nid, slab_barn_nodes);
> >
> > I had a somewhat related question here.
> >
> > During memory hotplug, we call node_set() on slab_nodes when memory is brought
> > online, but we do not seem to call node_clear() when memory is taken offline. I
> > was wondering what the reasoning behind this is.
>
> Probably nobody took on the task to implement the necessary teardown.
>
> > That also made me wonder about a related case. If I am understanding this
> > correctly, even if all memory of a node has been offlined, slab_nodes would
> > still make it appear that the node has memory, even though in reality it no
> > longer does. If so, then in patch 3, the condition
> > "if (unlikely(!node_isset(numa_node, slab_nodes)))" in can_free_to_pcs() seems
> > like it would cause the object free path to skip sheaves.
>
> Maybe the condition should be looking at N_MEMORY then?
Yes, that's what I was thinking too.
I feel that, at least for the current patchset, this is probably a reasonable
approach.
>
> Also ideally we should be using N_NORMAL_MEMORY everywhere for slab_nodes.
> Oh we actually did, but gave that up in commit 1bf47d4195e45.
Thanks, I hadn't realized that node_clear had actually existed before.
>
> Note in practice full memory offline of a node can only be achieved if it
> was all ZONE_MOVABLE and thus no slab allocations ever happened on it. But
> if it has only movable memory, it's practically memoryless for slab
> purposes.
That's a good point! I just realized that too.
> Maybe the condition should be looking at N_NORMAL_MEMORY then.
> That would cover the case when it became offline and also the case when it's
> online but with only movable memory?
Exactly. Conceptually, N_NORMAL_MEMORY seems more precise than N_MEMORY. I took
a quick look through the code, though, and it seems that N_NORMAL_MEMORY hasn't
been fully handled in the hotplug code.
Given that, I think it makes sense to use N_MEMORY for now, and then switch to
N_NORMAL_MEMORY later once the handling there is improved.
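To make the scenario concrete, here is a small userspace model of the two
masks. It is only a sketch, not kernel code: every model_* name is made up for
illustration, and the shape of model_can_free_to_pcs() is my assumption based
on the condition quoted above, not the actual patch 3 code. It assumes that
slab_nodes is set on memory online and never cleared, while an N_MEMORY-like
mask drops the node once its memory is offlined.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_MAX_NODES 8

/* One bit per node; a userspace stand-in for the kernel's nodemask_t. */
typedef unsigned int model_nodemask;

static void model_node_set(int nid, model_nodemask *mask)
{
	assert(nid >= 0 && nid < MODEL_MAX_NODES);
	*mask |= 1u << nid;
}

static void model_node_clear(int nid, model_nodemask *mask)
{
	assert(nid >= 0 && nid < MODEL_MAX_NODES);
	*mask &= ~(1u << nid);
}

static bool model_node_isset(int nid, model_nodemask mask)
{
	assert(nid >= 0 && nid < MODEL_MAX_NODES);
	return (mask >> nid) & 1u;
}

struct model_state {
	model_nodemask slab_nodes;	/* set on memory online, never cleared */
	model_nodemask n_memory;	/* nodes that currently have memory */
};

static void model_mem_online(struct model_state *st, int nid)
{
	model_node_set(nid, &st->slab_nodes);
	model_node_set(nid, &st->n_memory);
}

static void model_mem_offline(struct model_state *st, int nid)
{
	/* As discussed in the thread: slab_nodes gets no node_clear() here. */
	model_node_clear(nid, &st->n_memory);
}

/*
 * ASSUMED shape of the check: a cpu on a node without memory may free any
 * object to its sheaves; otherwise only node-local objects qualify.
 */
static bool model_can_free_to_pcs(model_nodemask has_memory, int cpu_node,
				  int slab_node)
{
	if (!model_node_isset(cpu_node, has_memory))
		return true;
	return slab_node == cpu_node;
}
```

With nodes 0 and 1 brought online and node 1 then fully offlined, passing
st.slab_nodes for a cpu on node 1 freeing an object from node 0 makes the model
refuse the sheaf path, while passing st.n_memory allows it. That is the
difference I had in mind when suggesting N_MEMORY for the check.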
>
> I don't know if with CONFIG_HAVE_MEMORYLESS_NODES it's possible that
> numa_mem_id() (the closest node with memory) would be ZONE_MOVABLE only.
> Maybe let's hope not, and not adjust that part?
>
I think that, in the CONFIG_HAVE_MEMORYLESS_NODES=y case, numa_mem_id() ends up
calling local_memory_node(), and the NUMA node it returns should be one that
can allocate slab memory. So the slab_node == numa_node check seems reasonable
to me.
So it seems that the issue being discussed here may only be specific to the
CONFIG_HAVE_MEMORYLESS_NODES=n case.
--
Thanks,
Hao