Re: [PATCH] x86: fix early boot crash on gcc-10

From: Arvind Sankar
Date: Mon Mar 16 2020 - 14:54:27 EST


On Mon, Mar 16, 2020 at 02:20:00PM -0400, Arvind Sankar wrote:
> On Mon, Mar 16, 2020 at 06:54:50PM +0100, Borislav Petkov wrote:
> > On Mon, Mar 16, 2020 at 02:42:34PM +0100, Peter Zijlstra wrote:
> > > Right I know, I looked for it recently :/ But since this is new in 10
> > > and 10 isn't released yet, I figured someone can add the attribute
> > > before it does get released.
> >
> > Yes, that would be a good solution.
> >
> > I looked at what happens briefly after building gcc10 from git and IINM,
> > the function in question - start_secondary() - already gets the stack
> > canary asm glue added so it checks for a stack canary.
> >
> > However, the stack canary value itself gets set later in that same
> > function:
> >
> > /* to prevent fake stack check failure in clock setup */
> > boot_init_stack_canary();
> >
> > so the asm glue which checks for it would need to reload the newly
> > computed canary value (it is 0 before we compute it and thus the
> > mismatch).
> >
> > So having a way to state "do not add stack canary checking to this
> > particular function" would be optimal. And since you already have the
> > "stack_protect" function attribute I figure adding a "no_stack_protect"
> > one should be easy...
> >
> > > > Or of course you could add noinline attribute to whatever got inlined
> > > > and contains some array or addressable variable that whatever
> > > > -fstack-protector* mode kernel uses triggers it. With -fstack-protector-all
> > > > it would never work even in the past I believe.
> > >
> > > I don't think the kernel supports -fstack-protector-all, but I could be
> > > mistaken.
> >
> > The other thing I was thinking was to carve out only that function into
> > a separate compilation unit and disable stack protector only for it.
> >
> > All IMHO of course.
> >
> > Thx.
> >
> > --
> > Regards/Gruss,
> > Boris.
> >
> > https://people.kernel.org/tglx/notes-about-netiquette
>
> With STACKPROTECTOR_STRONG, gcc9 (at least gentoo's version, not sure if
> they have some patches that affect it) already adds stack canary into
> start_secondary. Not sure why it doesn't panic already with gcc9?
>
> 00000000000008f0 <start_secondary>:
> 8f0: 53 push %rbx
> 8f1: 48 83 ec 10 sub $0x10,%rsp
> 8f5: 65 48 8b 04 25 28 00 mov %gs:0x28,%rax
> 8fc: 00 00
> 8fe: 48 89 44 24 08 mov %rax,0x8(%rsp)
> 903: 31 c0 xor %eax,%eax
> ...
> a2e: e8 00 00 00 00 callq a33 <start_secondary+0x143>
> a2f: R_X86_64_PLT32 cpu_startup_entry-0x4
> a33: 48 8b 44 24 08 mov 0x8(%rsp),%rax
> a38: 65 48 33 04 25 28 00 xor %gs:0x28,%rax
> a3f: 00 00
> a41: 75 12 jne a55 <start_secondary+0x165>
> a43: 48 83 c4 10 add $0x10,%rsp
> a47: 5b pop %rbx
> a48: c3 retq
> a49: 0f 01 1d 00 00 00 00 lidt 0x0(%rip) # a50 <start_secondary+0x160>
> a4c: R_X86_64_PC32 debug_idt_descr-0x4
> a50: e9 cb fe ff ff jmpq 920 <start_secondary+0x30>
> a55: e8 00 00 00 00 callq a5a <start_secondary+0x16a>
> a56: R_X86_64_PLT32 __stack_chk_fail-0x4

Wait a sec, this function calls cpu_startup_entry as the last thing it
does, which should never return, and hence the stack canary value should
never get checked.

How does the canary get checked with the gcc10 code?

boot_init_stack_canary depends on working if called from functions that
don't return. If that doesn't work with gcc10, we need to disable stack
protector for the other callpoints too -- start_kernel in init/main.c
and cpu_bringup_and_idle in arch/x86/xen/smp_pv.c.

/*
* Initialize the stackprotector canary value.
*
* NOTE: this must only be called from functions that never return,
* and it must always be inlined.
*/
static __always_inline void boot_init_stack_canary(void)