Re: [PATCH 04/13] mm: Use array_size() helpers for kmalloc()

From: Rasmus Villemoes
Date: Wed May 09 2018 - 14:00:41 EST


On 2018-05-09 13:34, Matthew Wilcox wrote:
> On Tue, May 08, 2018 at 05:42:20PM -0700, Kees Cook wrote:
>> @@ -499,6 +500,8 @@ static __always_inline void *kmalloc_large(size_t size, gfp_t flags)
>> */
>> static __always_inline void *kmalloc(size_t size, gfp_t flags)
>> {
>> + if (size == SIZE_MAX)
>> + return NULL;
>> if (__builtin_constant_p(size)) {
>> if (size > KMALLOC_MAX_CACHE_SIZE)
>> return kmalloc_large(size, flags);
>
> I don't like the add-checking-to-every-call-site part of this patch.
> Fine, the compiler will optimise it away if it can calculate it at compile
> time, but there are a lot of situations where it can't. You aren't
> adding any safety by doing this; trying to allocate SIZE_MAX bytes is
> guaranteed to fail, and it doesn't need to fail quickly.

Yeah, agree that we don't want to add a size check to all callers,
including those where the size doesn't even come from one of the new
*_size helpers; that just adds bloat. It's true that the overflow case
does not have to fail quickly, but I was worried that the saturating
helpers would end up making gcc emit a cmov instruction, thus stalling
the regular path. But it seems that it actually ends up doing a forward
jump, sets %rdi to SIZE_MAX, then jumps back to the call of __kmalloc,
so it should be ok.

With __builtin_constant_p(size) && size == SIZE_MAX, gcc could be smart
enough to elide those two instructions and have the jo go directly to
the caller's error handling, but at least gcc 5.4 doesn't seem to be
that smart. So let's just omit that part for now.

But in case of the kmalloc_array functions, with a direct call of
__builtin_mul_overflow(), gcc does combine the "return NULL" with the
callers error handling, thus avoiding the six byte "%rdi = -1; jmp
back;" thunk. That, along with the churn factor, might be an argument
for leaving the current callers of *_array alone. But if we are going to
keep those longer-term, we might as well convert kmalloc(a, b) into
kmalloc_array(a, b) instead of kmalloc(array_size(a, b)). In any case, I
do see the usefulness of the struct_size helper, and agree that we
definitely should not introduce a new *_struct variant that needs to be
implemented in all families.

>> @@ -624,11 +629,13 @@ int memcg_update_all_caches(int num_memcgs);
>> */
>> static inline void *kmalloc_array(size_t n, size_t size, gfp_t flags)
>> {
>> - if (size != 0 && n > SIZE_MAX / size)
>> + size_t bytes = array_size(n, size);
>> +
>> + if (bytes == SIZE_MAX)
>> return NULL;
>> if (__builtin_constant_p(n) && __builtin_constant_p(size))
>> - return kmalloc(n * size, flags);
>> - return __kmalloc(n * size, flags);
>> + return kmalloc(bytes, flags);
>> + return __kmalloc(bytes, flags);
>> }
>>
>> /**
>> @@ -639,7 +646,9 @@ static inline void *kmalloc_array(size_t n, size_t size, gfp_t flags)
>> */
>> static inline void *kcalloc(size_t n, size_t size, gfp_t flags)
>> {
>> - return kmalloc_array(n, size, flags | __GFP_ZERO);
>> + size_t bytes = array_size(n, size);
>> +
>> + return kmalloc(bytes, flags | __GFP_ZERO);
>> }
>
> Hmm. I wonder why we have the kmalloc/__kmalloc "optimisation"
> in kmalloc_array, but not kcalloc. Bet we don't really need it in
> kmalloc_array. I'll do some testing.
>

Huh? Because kcalloc is implemented in terms of kmalloc_array? And can't
we just keep it that way?

Rasmus