Re: kalloc_objs() may not be as safe as it seems
From: Alejandro Colomar
Date: Wed Mar 18 2026 - 12:52:58 EST
Hi Kees,
On 2026-03-17T22:40:53+0100, Alejandro Colomar wrote:
> On 2026-03-17T14:14:11-0700, Kees Cook wrote:
> > On Mon, Mar 16, 2026 at 07:33:34PM +0100, Alejandro Colomar wrote:
> > > I just learnt about kalloc_objs() et al. from
> > > <https://lwn.net/Articles/1062856/>.
> > >
> > > ptr = kmalloc_obj(*ptr);
> > > ptr = kmalloc_objs(*ptr, n);
> > >
> > > This resembles a lot the macro we have in shadow-utils, malloc_T(),
> > > which would be used as (to resemble the above):
> > >
> > > ptr = malloc_T(1, typeof(*ptr)); // But we'd really pass the type
> > > ptr = malloc_T(n, typeof(*ptr)); // But we'd really pass the type
> > >
> > > But I've noticed some design mistakes that make it not as safe as it
> > > seems.
> > >
> > > Default arguments
> > > ~~~~~~~~~~~~~~~~~
> > >
> > > I tend to think it's simpler to have a single API that works for both
> > > 1 element and multiple elements. The special-casing of 1 element seems
> > > unnecessary, and having a literal 1 works just fine.
> >
> > This is a reasonable opinion, but not one I wanted to try to fight for
> > with the Linux developer community. Linus has tended to frown on adding
> > any new burden when making these kinds of changes, and expecting
> > everyone to add a (seemingly) redundant "1" to all API calls (or an
> > empty comma) seems unlikely to fly.
>
> Should we add Linus to CC? I think the safety arguments, especially
> after his patch adding the default argument, talk strongly about using
> the literal 1, and dropping the _obj() macros. I hope the "1" is
> something people could consider.
>
> I understand the empty comma might be more controversial; although
> I hope we can eventually discuss it too.
>
> > > I think the combination of having the macros be variadic (for gfp) with
> > > having two very similar APIs that differ in number of arguments, and
> > > all those arguments being integer types, is prone to errors. Consider
> > > the case where one would accidentally write
> > >
> > > ptr = kmalloc_obj(*ptr, n); // Bogus
> > > instead of
> > > ptr = kmalloc_objs(*ptr, n);
> >
> > This loss of GFP flags wasn't part of my original design, and I would
> > agree that given the lack of type checking for GFP flags, this does look
> > like a potential foot-gun.
>
> Indeed; I agree Linus's change was the worst part.
>
> Having just one of these two issues separately would be relatively fine,
> and it's the combination of both that makes it such a foot-gun. From
> the two issues, the variadic argument list is probably the most
> dangerous part.
>
> > > The compiler wouldn't realize at all. That's a strong argument in
> > > favour of having default arguments be required to be explicit, with an
> > > empty argument:
> > >
> > > ptr = kmalloc_obj(*ptr,);
> > > ptr = kmalloc_objs(*ptr, n,);
> > >
> > > I know you (and Linus too, FWIW) have previously claimed that it looks
> > > weird to the eye. But I'm pretty sure you could get used to it. That's
> > > certainly going to be safer.
> > >
> > > With mandatory empty arguments, the compiler would easily distinguish
> > > mistakes like the one above.
> >
> > I'd rather we get something like the __strict typedef so "gfp_t" would
> > be a true separate type, not just a silent alias of "int".
How about using a struct? That's the idiomatic way of having
incompatible types. Since these are used through macros, it wouldn't
change anything to users.
The macro definitions would have to change. For example:
-#define GFP_NOFS (__GFP_RECLAIM | __GFP_IO)
+#define GFP_NOFS ((void)0, (struct gfp){__GFP_RECLAIM | __GFP_IO})
We don't need any new language features. This is backwards compatible
up to C99. And it might end up being simpler than __strict typedef.
> Yup, that would help too. Although in general I still think that
> variadic argument lists are very dangerous: there's no difference
> between "the programmer forgot to specify the argument, and the compiler
> didn't remind them" and "the programmer wanted the default value". This
> is a source of long-term issues.
>
> I understand the extra comma might be difficult to sell at the moment,
> but we should try to figure out a way to sell it in the long term.
>
> > > Type safety
> > > ~~~~~~~~~~~
> > >
> > > Apart from the issues with the above, the ability to pass a variable
> > > instead of a type name is also a bad choice. In shadow-utils, we
> >
> > This was a first-order requirement, or we'd never be able to refactor
> > the codebase to use the new API. There was already a heavy mixture of
> > types and variables used within sizeof(), and trying to take that part
> > and find types exceeded Coccinelle's abilities.
> >
> > Making the refactor "trivial" was important; but see below.
>
> Sounds reasonable. This imperfect API is still (or was, until Linus
> turned it into a double foot-gun) better than what was before, and if it
> serves as a step to eventually reach the better APIs, it's great.
After sleeping, I had some idea.
We could have coccinelle add typeof() around the first parameter when
it's an expression (not a type). Then, we could enforce that the first
parameter is a type name.
That is:
p = kmalloc_objs(int, 42); // ok
-q = kmalloc_objs(*p, 7);
+q = kmalloc_objs(typeof(*p), 7);
I expect this would be doable with coccinelle.
Then, new code would be required to pass a type name. And people could
slowly replace the existing typeof() calls at their own pace.
What do you think?
Have a lovely day!
Alex
> > > require a type name, and a variable is rejected. We implement that with
> > > the typeas() macro:
> > >
> > > #define typeas(T) typeof((T){0})
> > >
> > > This macro works exactly like typeof(), but it requires that the input
> > > is also a type. Passing a variable is a syntax error. We implement
> > > malloc_T() with it:
> > >
> > > // malloc_T - malloc type-safe
> > > #define malloc_T_(n, T) \
> > > ({ \
> > > (typeas(T) *){reallocarray(n, sizeof(T))}; \
> > > })
> > >
> > > which is used as (taking some arbitrary examples from shadow-utils):
> > >
> > > lp = xmalloc_T(1, struct link_name);
> > > targs = xmalloc_T(n_args + 3, char *);
> > >
> > > Some reasons for passing a type name instead of a variable are:
> > >
> > > - It allows grepping for all allocations of a given type.
> > > - It adds readability. It's similar to declaring variables with some
> > > explicit type, vs. using 'auto' (__auto_type) everywhere.
> >
> > Sure, and that's why many places were already using type names, and
> > the use of "auto" was even one of Linus's driving examples for why the
> > new API could be very easily used.
>
> Nice.
>
> > > But there's also a safety aspect. Consider we want to allocate an array
> > > of 42 ints. And consider the programmer accidentally swaps arguments.
> > >
> > > int *p = malloc_T(int, 42); // syntax error
> > > int *p = malloc_T(42, int);
> > > vs
> > > int *p = kmalloc_objs(*p, 42);
> > > int *p = kmalloc_objs(42, *p); // Bogus
> > >
> > > The latter is dereferencing an uninitialized pointer. If for some
> > > reason the pointer had a value before this call, you'd be allocating as
> > > many elements as *p says, which would be bogus, and since typeof(42) is
> > > the same as typeof(*p), the return type would be valid, so this would
> > > still compile.
> >
> > Yeah, this is a foot-gun too. I'm open to requiring a type, but it's a
> > significant amount of careful refactoring needed to accomplish it. It
> > was a weakness of the existing API, though it was more "obvious" since
> > it was visually contained by "sizeof()".
>
> Agree. The good part is that the refactor doesn't need to be as
> careful, because incorrect types will result in build errors, and not
> bugs. But yes, it needs significantly more work than what the current
> API required.
>
> Maybe you could diagnose with coccinelle where expressions are used?
> Then people could slowly look at their diagnostics and replace them.
>
>
> Have a lovely night!
> Alex
>
> --
> <https://www.alejandro-colomar.es>
--
<https://www.alejandro-colomar.es>
Attachment:
signature.asc
Description: PGP signature