Re: [PATCH RFC 10/19] slab: remove cpu (partial) slabs usage from allocation paths

From: Vlastimil Babka

Date: Sat Jan 10 2026 - 10:41:17 EST


On 1/10/26 14:20, Chris Mason wrote:
> On 1/9/26 3:16 AM, Vlastimil Babka wrote:
>> On 10/24/25 16:29, Chris Mason wrote:
>>> On Thu, 23 Oct 2025 15:52:32 +0200 Vlastimil Babka <vbabka@xxxxxxx> wrote:
>
> [ ... ]
>
>> By the way, there was another bug in this patch, causing a severe memory
>> leak, which the AI unfortunately didn't flag. Petr reported it during
>> performance testing and it took me more than a day to find it. Oh well :)
>>
>> Wonder if things got better since then perhaps, and your or Roman's tools
>> would find it today? :)
>
> Yes and no. It didn't find the leak until I changed the prompt to say:
> "there is a leak, find it". I'll see if I can improve things...

Thanks. Hmm, even if it has to be prompted like this, it could be a
substantial time saver compared to finding the leak myself.

>> diff --git a/mm/slub.c b/mm/slub.c
>> --- a/mm/slub.c
>> +++ b/mm/slub.c
>
> [ ... ]
>
>> @@ -3592,54 +3549,53 @@ static void *get_partial_node(struct kmem_cache *s,
>>  	else if (!spin_trylock_irqsave(&n->list_lock, flags))
>>  		return NULL;
>>  	list_for_each_entry_safe(slab, slab2, &n->partial, slab_list) {
>> +
>> +		struct freelist_counters old, new;
>> +
>>  		if (!pfmemalloc_match(slab, pc->flags))
>>  			continue;
>>
>>  		if (IS_ENABLED(CONFIG_SLUB_TINY) || kmem_cache_debug(s)) {
>> -			void *object = alloc_single_from_partial(s, n, slab,
>> +			object = alloc_single_from_partial(s, n, slab,
>>  							pc->orig_size);
>> -			if (object) {
>> -				partial = slab;
>> -				pc->object = object;
>> +			if (object)
>>  				break;
>> -			}
>>  			continue;
>>  		}
>>
>> +		/*
>> +		 * get a single object from the slab. This might race against
>> +		 * __slab_free(), which however has to take the list_lock if
>> +		 * it's about to make the slab fully free.
>> +		 */
>> +		do {
>> +			old.freelist = slab->freelist;
>> +			old.counters = slab->counters;
>> +
>> +			new.freelist = get_freepointer(s, old.freelist);
>> +			new.counters = old.counters;
>> +			new.inuse++;
>> +
>> +		} while (!__slab_update_freelist(s, slab, &old, &new, "get_partial_node"));
>> +
>> +		object = old.freelist;
>> +		if (!new.freelist)
>> +			remove_partial(n, slab);
>
> Is there a missing break statement here? The debug path above breaks out
> of the loop after successfully allocating an object, but this non-debug
> path continues iterating through the partial list. Each iteration overwrites
> the object variable, so previously allocated objects would be leaked.
>
> The commit message says "Now we only want to return a single object" which
> matches the debug path behavior, but the non-debug path appears to allocate
> from every matching slab in the list.
>
>>  	}
>>  	spin_unlock_irqrestore(&n->list_lock, flags);
>> -	return partial;
>> +	return object;
>>  }
>
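
For illustration only, here is a toy userspace model of the question raised
in the quoted review: a loop that takes one object from every matching slab
but returns only the last one, versus a loop that stops after the first
success. None of this is kernel code. struct toy_slab, toy_take_one(),
toy_get_partial() and toy_total_inuse() are hypothetical names invented for
the sketch, and treating "break after the first successful allocation" as
the intended behavior is an assumption based on the quoted commit message
("Now we only want to return a single object"), not a statement about what
the eventual fix looked like.

```c
#include <stddef.h>

/* Toy stand-in for a slab: only tracks free and in-use object counts. */
struct toy_slab {
	int free;
	int inuse;
};

/*
 * Take one object from a slab. Returns a non-NULL token (the slab's own
 * address) on success, or NULL if the slab has no free objects.
 */
static void *toy_take_one(struct toy_slab *slab)
{
	if (slab->free == 0)
		return NULL;
	slab->free--;
	slab->inuse++;
	return slab;
}

/*
 * Walk the "partial list" and return one object.
 *
 * With stop_after_first == 0 the loop keeps going after a successful
 * allocation, so every slab hands out an object but only the last one
 * is returned; the earlier ones are marked in use with no owner.
 * With stop_after_first != 0 the loop ends after the first success.
 */
static void *toy_get_partial(struct toy_slab *slabs, size_t n,
			     int stop_after_first)
{
	void *object = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		void *taken = toy_take_one(&slabs[i]);

		if (!taken)
			continue;

		object = taken;	/* overwrites any previously taken object */
		if (stop_after_first)
			break;
	}
	return object;
}

/* Sum of in-use objects across all slabs, to compare against what was returned. */
static int toy_total_inuse(const struct toy_slab *slabs, size_t n)
{
	int total = 0;
	size_t i;

	for (i = 0; i < n; i++)
		total += slabs[i].inuse;
	return total;
}
```

With three toy slabs that each have free objects, one call with
stop_after_first == 0 returns a single object while toy_total_inuse()
reports three in use, so two objects are unaccounted for per call. With
stop_after_first != 0 the two numbers agree.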