Re: [PATCH] mm/slub: preserve original size in _kmalloc_nolock_noprof retry path

From: Vlastimil Babka (SUSE)

Date: Thu Jun 04 2026 - 04:00:36 EST


On 6/4/26 07:46, Harry Yoo wrote:
>
>
> On 6/3/26 10:10 PM, hu.shengming@xxxxxxxxxx wrote:
>> From: Shengming Hu <hu.shengming@xxxxxxxxxx>
>>
>> _kmalloc_nolock_noprof() retries from the next kmalloc bucket when the
>> initial allocation fails. The retry currently reuses `size` as the
>> bucket selector and overwrites it with s->object_size + 1.
>>
>> That value is later passed as the original allocation size to
>> __slab_alloc_node(), slab_post_alloc_hook() and kasan_kmalloc(). On a
>> successful retry this makes KASAN/slub-debug observe the retry bucket
>> selector rather than the caller requested size, potentially widening the
>> valid kmalloc range and hiding overflows.
>
> Good catch!
>
>> Keep a separate `bucket_size` for choosing the retry cache and preserve
>> `size`.
>>
>> Fixes: <af92793e52c3> ("slab: Introduce kmalloc_nolock() and kfree_nolock()")
>> Signed-off-by: Shengming Hu <hu.shengming@xxxxxxxxxx>
>> ---
>
> I want to note that this conflicts with Vlastimil's work-in-progress
> feature [1] where it separately stores orig_size in struct
> slab_alloc_context which naturally solves the problem.

Thanks, didn't realize it was solving this problem as a side-effect, hehe.

> [1]
> https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=b4/slab_alloc_flags
>
> No strong opinion on whether to queue this patch and then rebase
> Vlastimil's work on top, or wait for Vlastimil's work to solve the
> problem instead...

It's better to queue fixes separately first, in case someone wants to
backport them, even though stable shouldn't be really necessary here.
This fix could still go to 7.2 but my series only to 7.3.

>> mm/slub.c | 7 ++++---
>> 1 file changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/mm/slub.c b/mm/slub.c
>> index 67abbbf68fc1..6a2b3ade3611 100644
>> --- a/mm/slub.c
>> +++ b/mm/slub.c
>> @@ -5350,6 +5350,7 @@ EXPORT_SYMBOL(__kmalloc_noprof);
>> void *_kmalloc_nolock_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t gfp_flags, int node)
>> {
>> gfp_t alloc_gfp = __GFP_NOWARN | __GFP_NOMEMALLOC | gfp_flags;
>> + size_t bucket_size = size;
>> struct kmem_cache *s;
>> bool can_retry = true;
>> void *ret;
>
> But in either way, I think it's more straightforward to introduce
> orig_size as a variable to keep the original size and pass it to
> __slab_alloc_node().

Agree. Because above, we wouldn't initialize bucket_size to a real bucket
size, but a <=bucket size so it's misleading.

>> @@ -5372,9 +5373,9 @@ void *_kmalloc_nolock_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t gfp_flags, in
>> return NULL;
>>
>> retry:
>> - if (unlikely(size > KMALLOC_MAX_CACHE_SIZE))
>> + if (unlikely(bucket_size > KMALLOC_MAX_CACHE_SIZE))
>> return NULL;
>> - s = kmalloc_slab(size, NULL, alloc_gfp, PASS_TOKEN_PARAM(token));
>> + s = kmalloc_slab(bucket_size, NULL, alloc_gfp, PASS_TOKEN_PARAM(token));
>>
>> if (!(s->flags & __CMPXCHG_DOUBLE) && !kmem_cache_debug(s))
>> /*
>> @@ -5408,7 +5409,7 @@ void *_kmalloc_nolock_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t gfp_flags, in
>> */
>> if (!ret && can_retry) {
>> /* pick the next kmalloc bucket */
>> - size = s->object_size + 1;
>> + bucket_size = s->object_size + 1;
>> /*
>> * Another alternative is to
>> * if (memcg) alloc_gfp &= ~__GFP_ACCOUNT;
>