RE: [PATCH] slab: kmalloc_size_roundup() must not return 0 for non-zero size

From: David Laight
Date: Wed Sep 06 2023 - 05:15:03 EST


From: Vlastimil Babka
> Sent: 06 September 2023 09:48
>
> Please Cc: also R: folks in MAINTAINERS, adding them now.
>
> On 9/6/23 10:18, David Laight wrote:
> > The typical use of kmalloc_size_roundup() is:
> > ptr = kmalloc(sz = kmalloc_size_roundup(size), ...);
> > if (!ptr) return -ENOMEM.
> > This means it is vitally important that the returned value isn't
> > less than the argument even if the argument is insane.
> > In particular if kmalloc_slab() fails or the value is above
> > (MAX_ULONG - PAGE_SIZE) zero is returned and kmalloc() will return
> > it's single zero-length buffer.
> >
> > Fix by returning the input size on error or if the size exceeds
> > a 'sanity' limit.
> > kmalloc() will then return NULL is the size really is too big.
> >
> > Signed-off-by: David Laight <david.laight@xxxxxxxxxx>
> > Fixes: 05a940656e1eb ("slab: Introduce kmalloc_size_roundup()")
> > ---
> > The 'sanity limit' value doesn't really matter (even if too small)
> > It could be 'MAX_ORDER + PAGE_SHIFT' but one ppc64 has MAX_ORDER 16
> > and I don't know if that also has large pages.
>
> Well we do have KMALLOC_MAX_SIZE, which is based on MAX_ORDER + PAGE_SHIFT
> (and no issues on ppc64 so I'd expect the combination of MAX_ORDER and
> PAGE_SHIFT should always be such that it doesn't overflow on the particular
> arch) so I think it would be the most straightforward to simply use that.

Searching all the consta
nts gets hard :-)
>
> > Maybe it could be 1ul << 30 on 64bit, but it really doesn't matter
> > if it is too big.
> >
> > The original patch also added kmalloc_size_roundup() to mm/slob.c
> > that can also round up a value to zero - but has since been removed.
> >
> > mm/slab_common.c | 29 ++++++++++++++---------------
> > 1 file changed, 14 insertions(+), 15 deletions(-)
> >
> > diff --git a/mm/slab_common.c b/mm/slab_common.c
> > index cd71f9581e67..8418eccda8cf 100644
> > --- a/mm/slab_common.c
> > +++ b/mm/slab_common.c
> > @@ -747,22 +747,21 @@ size_t kmalloc_size_roundup(size_t size)
> > {
> > struct kmem_cache *c;
> >
> > - /* Short-circuit the 0 size case. */
> > - if (unlikely(size == 0))
> > - return 0;
> > - /* Short-circuit saturated "too-large" case. */
> > - if (unlikely(size == SIZE_MAX))
> > - return SIZE_MAX;
> > - /* Above the smaller buckets, size is a multiple of page size. */
> > - if (size > KMALLOC_MAX_CACHE_SIZE)
> > - return PAGE_SIZE << get_order(size);
> > + if (size && size <= KMALLOC_MAX_CACHE_SIZE) {
>
> I guess the whole test could all be likely().
>
> Also this patch could probably be just replacing the SIZE_MAX test with >=
> KMALLOC_MAX_SIZE, but since the majority is expected to be 0 < size <=
> KMALLOC_MAX_CACHE_SIZE, your reordering makes sense to me.

I re-ordered it because there are three cases and it didn't make any sense
to have two of them inside the first conditional.
In particular it made the comments easier to write!

Optimising for 'size == 0' (the unlikely() makes little difference)
is completely insane.
The compiler optimises 'if (size && size <= constant)' to
'if ((size - 1) < constant)' so you lose a conditional branch
in the hot paths

> > + /*
> > + * The flags don't matter since size_index is common to all.
> > + * Neither does the caller for just getting ->object_size.
> > + */
> > + c = kmalloc_slab(size, GFP_KERNEL, 0);
> > + return likely(c) ? c->object_size : size;
> > + }
> >
> > - /*
> > - * The flags don't matter since size_index is common to all.
> > - * Neither does the caller for just getting ->object_size.
> > - */
> > - c = kmalloc_slab(size, GFP_KERNEL, 0);
> > - return c ? c->object_size : 0;
> > + /* Return 'size' for 0 and very large - kmalloc() may fail. */
> > + if (unlikely((size - 1) >> (sizeof (long) == 8 ? 34 : 30)))
>
> So I'd just test for size == 0 || size > KMALLOC_MAX_SIZE?

That does generate better code, but on some arch adding 1
might make the constant (eg) 0x400000 not 0x3fffff and easier
to generate.

Actually, for consistency, it might be best to invert the
second compare as well.

I'll edit the patch to generate a v2 later.

David

>
> > + return size;
> > +
> > + /* Above the smaller buckets, size is a multiple of page size. */
> > + return PAGE_SIZE << get_order(size);
> > }
> > EXPORT_SYMBOL(kmalloc_size_roundup);
> >

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)