Re: Subject: [PATCH v2] slab: kmalloc_size_roundup() must not return 0 for non-zero size

From: Vlastimil Babka
Date: Mon Sep 11 2023 - 17:18:28 EST


On 9/11/23 18:12, David Laight wrote:
> From: Vlastimil Babka
>> Sent: 11 September 2023 16:54
>>
>> On 9/8/23 10:26, David Laight wrote:
>> > From: Kees Cook
>> >> Sent: 07 September 2023 20:38
>> >>
>> >> On Thu, Sep 07, 2023 at 12:42:20PM +0000, David Laight wrote:
>> >> > The typical use of kmalloc_size_roundup() is:
>> >> > ptr = kmalloc(sz = kmalloc_size_roundup(size), ...);
>> >> > if (!ptr) return -ENOMEM.
>> >> > This means it is vitally important that the returned value isn't
>> >> > less than the argument even if the argument is insane.
>> >> > In particular if kmalloc_slab() fails or the value is above
>> >> > (MAX_ULONG - PAGE_SIZE) zero is returned and kmalloc() will return
>> >> > it's single zero-length buffer.
>> >> >
>> >> > Fix by returning the input size on error or if the size exceeds
>> >> > a 'sanity' limit.
>> >> > kmalloc() will then return NULL is the size really is too big.
>> >> >
>> >> >
>> >> > Signed-off-by: David Laight <david.laight@xxxxxxxxxx>
>> >> > Fixes: 05a940656e1eb ("slab: Introduce kmalloc_size_roundup()")
>> >> > ---
>> >> > v2:
>> >> > - Use KMALLOC_MAX_SIZE for upper limit.
>> >> > (KMALLOC_MAX_SIZE + 1 may give better code on some archs!)
>> >> > - Invert test for overlarge for consistency.
>> >> > - Put a likely() on result of kmalloc_slab().
>> >> >
>> >> > mm/slab_common.c | 26 +++++++++++++-------------
>> >> > 1 file changed, 13 insertions(+), 13 deletions(-)
>> >> >
>> >> > diff --git a/mm/slab_common.c b/mm/slab_common.c
>> >> > index cd71f9581e67..0fb7c7e19bad 100644
>> >> > --- a/mm/slab_common.c
>> >> > +++ b/mm/slab_common.c
>> >> > @@ -747,22 +747,22 @@ size_t kmalloc_size_roundup(size_t size)
>> >> > {
>> >> > struct kmem_cache *c;
>> >> >
>> >> > - /* Short-circuit the 0 size case. */
>> >> > - if (unlikely(size == 0))
>> >> > - return 0;
>> >>
>> >> If we want to allow 0, let's just leave this case as-is: the compiler
>> >> will optimize it against the other tests.
>> >
>> > I doubt the compiler will optimise it away - especially with
>> > the unlikely().
>>
>> Yeah I also think compiler can't do much optimizations except for build-time
>> constant 0 here.
>
> Only relevant if the code were inlined - and it isn't.

Aha, I thought it was, good point.

> (and is probably a bit big.)
> I'm not sure you'd want to expose kmalloc_slab() to the wider kernel.

No, let's keep it that way.

> OTOH, it could have an inline version for constants > KMALLOC_CACHE_SIZE.
> But they may not happen often enough to make any difference.

Yeah, unnecessary.

>>
>> > OTOH the explicit checks for (size && size <= LIMIT) do
>> > get optimised to ((size - 1) <= LIMIT - 1) so become
>> > a single compare.
>> >
>> > Then returning 'size' at the bottom means that zero is returned
>> > in the arg is zero - which is fine.
>> >
>> >>
>> >> > - /* Short-circuit saturated "too-large" case. */
>> >> > - if (unlikely(size == SIZE_MAX))
>> >> > - return SIZE_MAX;
>> >> > + if (size && size <= KMALLOC_MAX_CACHE_SIZE) {
>> >> > + /*
>> >> > + * The flags don't matter since size_index is common to all.
>> >> > + * Neither does the caller for just getting ->object_size.
>> >> > + */
>> >> > + c = kmalloc_slab(size, GFP_KERNEL, 0);
>> >> > + return likely(c) ? c->object_size : size;
>> >>
>> >> I would like to have this fail "safe". c should never be NULL here, so
>> >> let's return "KMALLOC_MAX_SIZE + 1" to force failures.
>> >
>> > Why even try to force failure here?
>> > The whole function is just an optimisation so that the caller
>> > can use the spare space.
>> >
>> > The only thing it mustn't do is return a smaller value.
>>
>> If "c" is NULL it means either the kernel build must be broken e.g. by
>> somebody breaking the KMALLOC_MAX_CACHE_SIZE value, and we could just ignore
>> c being NULL and let it crash because of that.
>> But I think it can also be NULL due to trying to call kmalloc_size_roundup()
>> too early, when kmalloc_caches array is not yet populated. Note if we call
>> kmalloc() itself too early, we get a NULL as a result, AFAICS. I can imagine
>> two scenarios:
>>
>> - kmalloc_size_roundup() is called with result immediately fed to kmalloc()
>> that happens too early, in that case we best should not crash on c being
>> NULL and make sure the kmalloc() returns NULL.
>> - kmalloc_size_roundup() is called in some init code to get a value that
>> some later kmalloc() call uses. We might want also not crash in that case,
>> but informing the developer that they did something wrong would be also useful?
>>
>> Clearly returning 0 if c == NULL, as done currently, is wrong for both
>> scenarios. Retuning "size" is OK for the first scenario, also valid for the
>> second one, but the caller will silently lose the benefit of
>> kmalloc_size_roundup() and the developer introducing that won't realize it's
>> done too early and could be fixed.
>
> I'm sure that won't matter.

For the performance, sure. It just feels silly to me to have a code that
looks like it does something, but silently doesn't. Leads to cargo cult
copying it to other places etc.

>> So perhaps the best would be to return size for c == NULL, but also do a
>> WARN_ONCE?
>
> That would add a real function call to an otherwise leaf function
> and almost certainly require the compiler create a stack frame.

Hm I thought WARN is done by tripping on undefined instruction like BUG
these days. Also any code that accepts the call to kmalloc_size_roundup
probably could accept that too.

>
> ...
>
> I did have an interesting 'lateral thought' idea.
> It is all very silly doing all the work twice, what you really
> want is kmalloc() to return both the pointer and actual size.
> But returning a 'two word' structure is done by reference and
> would kill performance/
> OTOH a lot of archs can return two word integers in a register pair
> (dx:ax on x86).
> Could you have the real function return ((unsigned __int64)size << 64 | (long)ptr)
> and then extract the size in a wrapper macro?
> (With different types for 32bit)
>
> That will, of course, break the 'it's like malloc' checks the
> compiler is doing - unless it is taught what is going on.

Probably this is something not worth all the trouble.

> David
>
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)