Re: [PATCH] x86/syscall: Avoid memcpy() for ia32 syscall_get_arguments()

From: Kees Cook
Date: Fri Jul 12 2024 - 13:55:25 EST


On Fri, Jul 12, 2024 at 11:00:08AM +0200, Peter Zijlstra wrote:
> On Thu, Jul 11, 2024 at 04:10:43PM -0700, Kees Cook wrote:
>
> > The long answer is long, and comes in two halves: the language half and
> > the fortify half.
> >
> > First, the C standard requires that all function argument arrays be
> > decayed to pointers, so your prototype is semantically handled as if
> > it were:
> >
> > void syscall_get_arguments(struct pt_regs *regs, unsigned long *args)
> >
> > The "6" is just totally thrown away by the language. :(
>
> Bah. I mean, I understand why, they *are* pointers after all. But to
> then also discard the object size is just silly. I mean, traditionally C
> doesn't give a crap about object size -- it is irrelevant.
>
> But with alias analysis (which we switch off) and doubly so with this
> fortify stuff, you do care about it.
>
> So why throw it out?

Yes, it boggles the mind. And for a language that doesn't care about
size, it sure cares about size in weird places.

> > And the language is so busted about this that there is actually a
> > diagnostic for "don't do that" that shows up with this code:
> >
> > <source>: In function 'syscall_get_arguments':
> > <source>:53:22: warning: 'sizeof' on array function parameter 'args' will return size of 'long unsigned int *' [-Wsizeof-array-argument]
> > 53 | report(sizeof(args));
> > | ^
>
> :-( This just doesn't make sense. How can you be so inconsistent about
> things.
>
> What will actually break if you 'fix' this? Given that inlining (see
> below) changes the rules willy nilly, I feel we can (and should!) just
> fix this.

I'm not sure -- I have kind of given up on "standard" C helping with any
of this. I look to consistent language extensions now, and where there
isn't any, we've been trying to create them. :P And we're not alone:
Apple's -fbounds-safety stuff[1] looks good too, and overlaps with what
we were already designing with the "counted_by" attribute:
https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/
(We borrowed the "counted_by" name, which is better than what we were
calling it: "element_count".)

> > Does report the expected things for _bdos internally (48), but not for
> > sizeof (8) nor _bos (SIZE_MAX). Of course if we inline it, _bos starts
> > working and, along with _bdos, realizes it was lied to, and reports
> > 32 again.
>
> WTF ?!?! How can all this be so inconsistent and why are people okay
> with that?

This. A thousands times, this. I'm really not okay with it, and we've
been working to get rid of every ambiguity we trip over. It's made sane
bounds checking in Linux extremely hard to get right.

For more fun with array bounds, the one that absolutely floored me was
the exception over trailing arrays:

struct middle_t {
u8 array[6];
int foo;
} *middle;

__builtin_object_size(middle->array, 1) == 6

struct trailing_t {
int foo;
u8 array[6];
} *trailing;

__builtin_object_size(trailing->array, 1) == SIZE_MAX ("unknown")

Because of all the fake flexible array abuses over decades, _bos/_bdos
were forced to _ignore_ the size of an array if it was the last member
of a struct. But after C99 flexible arrays were introduced (24 years
ago), nobody cleaned this up! We had to go get the compilers to create
-fstrict-flex-arrays=3 so that all those weird behaviors would go away:
https://git.kernel.org/linus/df8fc4e934c12b906d08050d7779f292b9c5c6b5

> > For the patch in this thread, the W=1 case was reported (overread of
> > "bx"), because normally fortify would just ignore the overread of
> > the source.
>
> So I'm not entirely sure I agree with that argument. Yes, &regs->bx is
> 'unsigned long *' and sizeof(unsigned long) is 8 (if we assume 64bit).
> However, you can also read it as the point of pt_regs where bx sits --
> which is a far more sensible interpretation IMO.
>
> Because then we're looking at struct pt_regs and an offset therein.

Right -- the way to make this unambiguous has been to make sure there
is an addressable object which contains the elements in question. For
the least disruption, the best we were able to do is introduce the
struct_group() helper. It's internally ugly, but it works.

> So really pt_regs *is* an array of unsigned long, and I feel it is
> really unfortunate we cannot express this in a way that is more concise.

A way to do this would be:

struct pt_regs {
union {
struct {
unsigned long bx;
unsigned long cx;
unsigned long dx;
unsigned long si;
unsigned long di;
unsigned long bp;
};
unsigned long syscall_regs[6];
};
unsigned long ax;
...
};

Now regs->syscall_regs is addressable, sized, etc. "bx" means just bx,
and "syscall_regs" means all 6.

I wrote up a bunch of notes about much of this horror last year here:
https://people.kernel.org/kees/bounded-flexible-arrays-in-c

-Kees

--
Kees Cook