Re: x86-64: Maintain 16-byte stack alignment

From: Josh Poimboeuf
Date: Thu Jan 12 2017 - 23:28:10 EST


On Thu, Jan 12, 2017 at 07:23:18PM -0800, Andy Lutomirski wrote:
> On Thu, Jan 12, 2017 at 7:11 PM, Josh Poimboeuf <jpoimboe@xxxxxxxxxx> wrote:
> > On Thu, Jan 12, 2017 at 05:46:55PM -0800, Andy Lutomirski wrote:
> >> On Thu, Jan 12, 2017 at 12:15 PM, Josh Poimboeuf <jpoimboe@xxxxxxxxxx> wrote:
> >> > On Thu, Jan 12, 2017 at 12:08:07PM -0800, Andy Lutomirski wrote:
> >> >> On Thu, Jan 12, 2017 at 11:51 AM, Linus Torvalds
> >> >> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> >> >> > On Thu, Jan 12, 2017 at 6:02 AM, Josh Poimboeuf <jpoimboe@xxxxxxxxxx> wrote:
> >> >> >>
> >> >> >> Just to clarify, I think you're asking if, for versions of gcc which
> >> >> >> don't support -mpreferred-stack-boundary=3, objtool can analyze all C
> >> >> >> functions to ensure their stacks are 16-byte aligned.
> >> >> >>
> >> >> >> It's certainly possible, but I don't see how that solves the problem.
> >> >> >> The stack will still be misaligned by entry code. Or am I missing
> >> >> >> something?
> >> >> >
> >> >> > I think the argument is that we *could* try to align things, if we
> >> >> > just had some tool that actually then verified that we aren't missing
> >> >> > anything.
> >> >> >
> >> >> > I'm not entirely happy with checking the generated code, though,
> >> >> > because as Ingo says, you have a 50:50 chance of just getting it right
> >> >> > by mistake. So I'd much rather have some static tool that checks
> >> >> > things at a code level (ie coccinelle or sparse).
> >> >>
> >> >> What I meant was checking the entry code to see if it aligns stack
> >> >> frames, and good luck getting sparse to do that. Hmm, getting 16-byte
> >> >> alignment for real may actually be entirely a lost cause. After all,
> >> >> I think we have some inline functions that do asm volatile ("call
> >> >> ..."), and I don't see any credible way of forcing alignment short of
> >> >> generating an entirely new stack frame and aligning that.
> >> >
> >> > Actually we already found all such cases and fixed them by forcing a new
> >> > stack frame, thanks to objtool. For example, see 55a76b59b5fe.
> >>
> >> What I mean is: what guarantees that the stack is properly aligned for
> >> the subroutine call? gcc promises to set up a stack frame, but does
> >> it promise that rsp will be properly aligned to call a C function?
> >
> > Yes, I did an experiment and you're right. I had naively assumed that
> > all stack frames would be aligned.
>
> Just to check: did you do your experiment with -mpreferred-stack-boundary=4?

Yes, but it's too late for me to be doing hard stuff and I think my
first experiment was bogus. I didn't use all the other kernel-specific
gcc options.

I tried again with all the kernel gcc options, except with
-mpreferred-stack-boundary=4 instead of 3, and actually came up with the
opposite conclusion.

I used the following code:

	void otherfunc(void);

	static inline void bar(long *f)
	{
		asm volatile("call otherfunc" : : "m" (f) : );
	}

	void foo(void)
	{
		long buf[3] = {0, 0, 0};
		bar(buf);
	}

The stack frame was always 16-byte aligned regardless of whether the
buf array size was even or odd.
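
For anyone who wants to repeat this kind of check at run time, here is a minimal user-space sketch. It is not what I used for the experiment above. The names is_aligned() and callee_frame_aligned() are hypothetical, __builtin_frame_address() is a GCC builtin, and the comment about the frame address assumes x86-64 with a frame pointer:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: 1 if addr is a multiple of align, else 0. */
static int is_aligned(uintptr_t addr, uintptr_t align)
{
	/* align must be a nonzero power of two for the mask trick to work */
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr & (align - 1)) == 0;
}

/*
 * Hypothetical stand-in for otherfunc(): reports whether its own frame
 * address is 16-byte aligned.  Assumption: x86-64 with a frame pointer,
 * where the frame address is the value of rsp right after "push %rbp",
 * so it is 16-byte aligned only if the caller's rsp was 16-byte aligned
 * at the call instruction.
 */
__attribute__((noinline)) static int callee_frame_aligned(void)
{
	return is_aligned((uintptr_t)__builtin_frame_address(0), 16);
}
```

Calling callee_frame_aligned() from callers with odd and even sized local buffers would be the run-time analogue of varying the buf size above.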

So my half-asleep brain is telling me that my original assumption was
right.

--
Josh