Re: [PATCH RFC 06/19] slab: introduce percpu sheaves bootstrap

From: Chris Mason

Date: Fri Oct 24 2025 - 11:33:18 EST


On Thu, 23 Oct 2025 15:52:28 +0200 Vlastimil Babka <vbabka@xxxxxxx> wrote:

> Until now, kmem_cache->cpu_sheaves was !NULL only for caches with
> sheaves enabled. Since we want to enable them for almost all caches,
> it's suboptimal to test the pointer in the fast paths, so instead
> allocate it for all caches in do_kmem_cache_create(). Instead of testing
> the cpu_sheaves pointer to recognize caches (yet) without sheaves, test
> kmem_cache->sheaf_capacity for being 0, where needed.
>
> However, for the fast paths sake we also assume that the main sheaf
> always exists (pcs->main is !NULL), and during bootstrap we cannot
> allocate sheaves yet.
>
> Solve this by introducing a single static bootstrap_sheaf that's
> assigned as pcs->main during bootstrap. It has a size of 0, so during
> allocations, the fast path will find it's empty. Since the size of 0
> matches sheaf_capacity of 0, the freeing fast paths will find it's
> "full". In the slow path handlers, we check sheaf_capacity to recognize
> that the cache doesn't (yet) have real sheaves, and fall back. Thus
> sharing the single bootstrap sheaf like this for multiple caches and
> cpus is safe.
>

Hi Vlastimil,

We're trying out the AI patch review automation on the BPF commits and it had
some questions about a few of these. Since the goal is to actually test the
automation, I'm putting them in unedited, but I did try and make sure they
were valid before sending.

> diff --git a/mm/slub.c b/mm/slub.c
> index a6e58d3708f4..ecb10ed5acfe 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c

[ ... ]

> @@ -7733,8 +7782,7 @@ static void free_kmem_cache_nodes(struct kmem_cache *s)
> void __kmem_cache_release(struct kmem_cache *s)
> {
> cache_random_seq_destroy(s);
> - if (s->cpu_sheaves)
> - pcs_destroy(s);
> + pcs_destroy(s);

Can pcs_destroy() handle a NULL cpu_sheaves pointer? Looking at the
error path in do_kmem_cache_create():

s->cpu_sheaves = alloc_percpu(struct slub_percpu_sheaves);
if (!s->cpu_sheaves) {
err = -ENOMEM;
goto out;
}
...
out:
if (err)
__kmem_cache_release(s);

If alloc_percpu() fails, __kmem_cache_release() now calls pcs_destroy()
unconditionally. In pcs_destroy(), the first operation is:

for_each_possible_cpu(cpu) {
pcs = per_cpu_ptr(s->cpu_sheaves, cpu);

Does per_cpu_ptr() dereference s->cpu_sheaves when it's NULL?

> #ifdef CONFIG_PREEMPT_RT
> if (s->cpu_slab)
> lockdep_unregister_key(&s->lock_key);

[ ... ]

> @@ -8608,12 +8656,10 @@ int do_kmem_cache_create(struct kmem_cache *s, const char *name,
>
> set_cpu_partial(s);
>
> - if (s->sheaf_capacity) {
> - s->cpu_sheaves = alloc_percpu(struct slub_percpu_sheaves);
> - if (!s->cpu_sheaves) {
> - err = -ENOMEM;
> - goto out;
> - }
> + s->cpu_sheaves = alloc_percpu(struct slub_percpu_sheaves);
> + if (!s->cpu_sheaves) {
> + err = -ENOMEM;
> + goto out;
> }

This error path triggers the call chain: do_kmem_cache_create() error
path -> __kmem_cache_release() -> pcs_destroy() with NULL cpu_sheaves.