Re: [PATCH v3 00/13] Virtually mapped stacks with guard pages (x86, core)

From: Josh Poimboeuf
Date: Fri Jun 24 2016 - 16:51:27 EST


On Fri, Jun 24, 2016 at 03:25:30PM -0500, Josh Poimboeuf wrote:
> On Fri, Jun 24, 2016 at 11:11:47AM -0700, Linus Torvalds wrote:
> > On Fri, Jun 24, 2016 at 10:51 AM, Linus Torvalds
> > <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> > >
> > > And in particular, the init_task stack initialization initialized it
> > > to the init_thread pointer. Which was definitely deadly.
> > >
> > > Let's see if that was it..
> >
> > No, it's still broken. But it's *less* broken, so here's a new version
> > of the patch that at least gets some of the stack setup right, in my
> > hope that somebody will bother to look at this, and being less broken
> > might mean that somebody sees what else I missed..
>
> I found at least one bug. The changing of task->stack from a "void *" to an
> "unsigned long *":
>
> > - void *stack;
> > + unsigned long *stack;
>
> That subtly changes the pointer arithmetic in do_boot_cpu():
>
>
> 	idle->thread.sp = (unsigned long) (((struct pt_regs *)
> 			  (THREAD_SIZE + task_stack_page(idle))) - 1);
>
>
> That ends up adding 128k to the stack page bottom instead of 16k.
>
> But fixing that doesn't seem to fix this:
>
> [18446743832.576241] ------------[ cut here ]------------
> [18446743832.576241] WARNING: CPU: 1 PID: 0 at /home/jpoimboe/git/linux/arch/x86/kernel/cpu/common.c:1434 cpu_init+0x34b/0x440
> [18446743832.576241] Modules linked in:
> [18446743832.576241] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.7.0-rc4+ #47
> [18446743832.576241] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014
> [18446743832.576241] 0000000000000086 574e5e6c6855ace9 ffff88007c553e88 ffffffff8143cb83
> [18446743832.576241] 0000000000000000 0000000000000000 ffff88007c553ec8 ffffffff810b0e7b
> [18446743832.576241] 0000059a00000000 0000000000000000 0000000000000000 0000000000000000
> [18446743832.576241] Call Trace:
> [18446743832.576241] [<ffffffff8143cb83>] dump_stack+0x85/0xc2
> [18446743832.576241] [<ffffffff810b0e7b>] __warn+0xcb/0xf0
> [18446743832.576241] [<ffffffff810b0fad>] warn_slowpath_null+0x1d/0x20
> [18446743832.576241] [<ffffffff810491bb>] cpu_init+0x34b/0x440
> [18446743832.576241] [<ffffffff8105ab7c>] start_secondary+0x1c/0x1a0
> [18446743832.576241] ---[ end trace 924d57afbaca0720 ]---
>
> So there's at least another bug lurking..

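To make the pointer arithmetic change quoted above concrete, here is a
minimal user-space sketch. It is not kernel code: THREAD_SIZE is assumed
to be 16k, fake_stack and the top_offset_*() helpers are hypothetical
names, and char * stands in for void * (GCC gives void * byte-granular
arithmetic as an extension). The last helper shows one possible way to
restore byte arithmetic, not necessarily the fix that gets applied.

```c
#include <assert.h>
#include <stdint.h>

#define THREAD_SIZE (16UL * 1024)	/* assumption: 16k kernel stacks */

/* Backing storage large enough that every computed pointer stays in bounds. */
static unsigned long fake_stack[THREAD_SIZE];

/* Byte offset of "THREAD_SIZE + stack" when stack is a byte-granular pointer. */
static unsigned long top_offset_byte_ptr(void)
{
	char *stack = (char *)fake_stack;

	return (unsigned long)((uintptr_t)(THREAD_SIZE + stack) - (uintptr_t)stack);
}

/* Same expression with an unsigned long *: scaled by sizeof(unsigned long). */
static unsigned long top_offset_ulong_ptr(void)
{
	unsigned long *stack = fake_stack;

	return (unsigned long)((uintptr_t)(THREAD_SIZE + stack) - (uintptr_t)stack);
}

/* Hypothetical fix: cast back to a byte pointer before adding. */
static unsigned long top_offset_fixed(void)
{
	unsigned long *stack = fake_stack;

	return (unsigned long)((uintptr_t)(THREAD_SIZE + (char *)stack) -
			       (uintptr_t)stack);
}

/* Usage: the typed pointer overshoots by a factor of sizeof(unsigned long). */
static void check_offsets(void)
{
	assert(top_offset_byte_ptr() == THREAD_SIZE);
	assert(top_offset_ulong_ptr() == THREAD_SIZE * sizeof(unsigned long));
	assert(top_offset_fixed() == THREAD_SIZE);
}
```

With an 8-byte unsigned long, the second helper yields 128k rather than
16k, which matches the numbers above.
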
Found another bug:

#define stack_smp_processor_id()					\
({									\
	struct thread_info *ti;						\
	__asm__("andq %%rsp,%0; ":"=r" (ti) : "0" (CURRENT_MASK));	\
	ti->cpu;							\
})

That macro is obviously no longer valid.

That seems to be what triggers the warning above: when booting CPU 1,
cpu_init() calls this macro, which incorrectly returns 0.
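
For anyone following along, a rough user-space sketch of what the mask
trick relies on. This is an illustration under assumptions, not the
kernel code: struct thread_info is cut down to one field, THREAD_SIZE is
assumed to be 16k, CURRENT_MASK is assumed to be ~(THREAD_SIZE - 1), and
cpu_from_sp() and the demo_*() helpers are hypothetical. The second demo
assumes thread_info no longer sits at the stack base and that the base
happens to hold zeroed memory, which is only one way the lookup could
come back as 0.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define THREAD_SIZE	(16UL * 1024)		/* assumption: 16k stacks */
#define CURRENT_MASK	(~(THREAD_SIZE - 1))	/* assumption: stack-base mask */

struct thread_info {
	int cpu;	/* simplified: the real struct has more fields */
};

/* Mask a stack address down to the stack base and read ->cpu from there. */
static int cpu_from_sp(const void *sp)
{
	const struct thread_info *ti =
		(const struct thread_info *)((uintptr_t)sp & CURRENT_MASK);

	return ti->cpu;
}

/* Old layout: thread_info lives at the base of a THREAD_SIZE-aligned stack. */
static int demo_old_layout(void)
{
	char *stack = aligned_alloc(THREAD_SIZE, THREAD_SIZE);
	int cpu;

	if (!stack)
		return -1;
	memset(stack, 0, THREAD_SIZE);
	((struct thread_info *)stack)->cpu = 1;
	cpu = cpu_from_sp(stack + THREAD_SIZE - 64);	/* some address in the stack */
	free(stack);
	return cpu;
}

/* Assumed new layout: thread_info lives elsewhere, the stack base is just zeros. */
static int demo_new_layout(void)
{
	struct thread_info ti = { .cpu = 1 };	/* no longer on the stack */
	char *stack = aligned_alloc(THREAD_SIZE, THREAD_SIZE);
	int cpu;

	(void)ti;
	if (!stack)
		return -1;
	memset(stack, 0, THREAD_SIZE);
	cpu = cpu_from_sp(stack + THREAD_SIZE - 64);	/* reads the zeroed base */
	free(stack);
	return cpu;
}
```

In this sketch demo_old_layout() finds CPU 1, while demo_new_layout()
reads whatever sits at the stack base and reports 0.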

--
Josh