Re: [PATCH] x86/orc: Don't bail on stack overflow

From: Andy Lutomirski
Date: Sat Nov 25 2017 - 19:16:58 EST


Can you send me whatever config and exact commit hash generated this?
I can try to figure out why it failed.

On Sat, Nov 25, 2017 at 3:13 PM, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> On Sat, 25 Nov 2017, Andy Lutomirski wrote:
>
>> On Sat, Nov 25, 2017 at 9:28 AM, Andy Lutomirski <luto@xxxxxxxxxx> wrote:
>> > If we overflow the stack into a guard page and then try to unwind
>> > it with ORC, it should work perfectly: by construction, there can't
>> > be any meaningful data in the guard page because no writes to the
>> > guard page will have succeeded.
>> >
>> > ORC seems entirely capable of unwinding in this situation, except
>> > that it doesn't even try. Adjust its initial stack check so that
>> > it's willing to try unwinding.
>> >
>> > I tested this by intentionally overflowing the task stack. The
>> > result is an accurate call trace instead of a trace consisting
>> > purely of '?' entries.
>> >
>> > Signed-off-by: Andy Lutomirski <luto@xxxxxxxxxx>
>> > ---
>> >
>> > Hi all-
>> >
>> > Ingo, this would have fixed half the debugging problem you had, I think.
>> > To really nail it, we'd want some kind of magic to annotate the trace
>> > so that page_fault (and async_page_fault) entries show CR2 and error_code.
>> >
>> > Josh, any ideas of how to do that cleanly? We could easily hard-code it
>> > in the OOPS unwinder, I guess.
>>
>> Actually, this does pretty well. We don't get CR2, but, when I added
>> an intentional bug kind of along the lines of the one you debugged,
>> the intermediate page fault successfully dumps all the regs in the
>> stack trace, so we get the faulting instruction *and* the registers.
>> We also get ORIG_RAX, which tells us the error code. We could be
>> fancy and decode that.
>
> It works in general, but for that case it's not much better than before
> vs. the '?' entries.
>
> Thanks,
>
> tglx
>
> [ 2.556065] PANIC: double fault, error_code: 0x0
> [ 2.557116] CPU: 1 PID: 273 Comm: systemd-udevd Not tainted 4.14.0-01256-g03dea81fe9f2-dirty #30
> [ 2.558930] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
> [ 2.560133] task: ffff880428dd8000 task.stack: ffffc900025fc000
> [ 2.560729] RIP: 0010:page_fault+0x11/0x60
> [ 2.561122] RSP: 0000:ffffffffff083fc8 EFLAGS: 00010046
> [ 2.561607] RAX: 00000000819d0ac7 RBX: 0000000000000001 RCX: ffffffff819d0ac7
> [ 2.562357] RDX: 0000000000000000 RSI: 0000000000000010 RDI: ffffffffff084078
> [ 2.563027] RBP: 000000000000000b R08: 00000000ffffffff R09: 0000000000000040
> [ 2.563726] R10: 0000000000000018 R11: 0000000000000246 R12: 0000000000000003
> [ 2.564429] R13: 000055719fd7d410 R14: 0000000000000000 R15: 0000000003938700
> [ 2.565104] FS: 00007f9edc0b78c0(0000) GS:ffff88042e440000(0000) knlGS:0000000000000000
> [ 2.565844] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 2.566396] CR2: ffffffffff083fb8 CR3: 0000000428ec4005 CR4: 00000000001606e0
> [ 2.567097] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 2.567761] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 2.568451] Call Trace:
> [ 2.568704] <SYSENTER>
> [ 2.568950] ? __do_page_fault+0x4b0/0x4b0
> [ 2.569348] ? page_fault+0x2c/0x60
> [ 2.569680] ? native_iret+0x7/0x7
> [ 2.570019] ? __do_page_fault+0x4b0/0x4b0
> [ 2.570396] ? page_fault+0x2c/0x60
> [ 2.570743] ? call_function_interrupt+0xc0/0xc0
> [ 2.571199] </SYSENTER>
> [ 2.571422] Code: ff e8 34 b7 6a ff e9 9f 02 00 00 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 83 c4 88 f6 84 24 88 00 00 00 03 75 20 <e8> 4a 01 00 00 48 89 e7 48 8b 74 24 78 48 c7 44 24 78 ff ff ff
> [ 2.573192] Kernel panic - not syncing: Machine halted.
> [ 2.573694] CPU: 1 PID: 273 Comm: systemd-udevd Not tainted 4.14.0-01256-g03dea81fe9f2-dirty #30
> [ 2.574528] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
> [ 2.575330] Call Trace:
> [ 2.575570] <#DF>
> [ 2.575760] dump_stack+0x46/0x59
> [ 2.576120] panic+0xde/0x223
> [ 2.576405] df_debug+0x29/0x30
> [ 2.576687] do_double_fault+0x9a/0x120
> [ 2.577057] double_fault+0x22/0x30
> [ 2.577376] RIP: 0010:page_fault+0x11/0x60
> [ 2.577775] RSP: 0000:ffffffffff083fc8 EFLAGS: 00010046
> [ 2.578314] RAX: 00000000819d0ac7 RBX: 0000000000000001 RCX: ffffffff819d0ac7
> [ 2.578979] RDX: 0000000000000000 RSI: 0000000000000010 RDI: ffffffffff084078
> [ 2.579666] RBP: 000000000000000b R08: 00000000ffffffff R09: 0000000000000040
> [ 2.580334] R10: 0000000000000018 R11: 0000000000000246 R12: 0000000000000003
> [ 2.581008] R13: 000055719fd7d410 R14: 0000000000000000 R15: 0000000003938700
> [ 2.581684] ? native_iret+0x7/0x7
> [ 2.582007] WARNING: can't dereference iret registers at ffffffffff084048 for ip page_fault+0x11/0x60
> [ 2.582008] </#DF>
> [ 2.583134] <SYSENTER>
> [ 2.583367] ? __do_page_fault+0x4b0/0x4b0
> [ 2.583751] ? page_fault+0x2c/0x60
> [ 2.584127] ? native_iret+0x7/0x7
> [ 2.584466] ? __do_page_fault+0x4b0/0x4b0
> [ 2.584860] ? page_fault+0x2c/0x60
> [ 2.585195] ? call_function_interrupt+0xc0/0xc0
> [ 2.585621] </SYSENTER>
> [ 2.586966] Dumping ftrace buffer:
> [ 2.587254] (ftrace buffer empty)
> [ 2.587534] Kernel Offset: disabled
> [ 2.587814] ---[ end Kernel panic - not syncing: Machine halted.
>
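
On the "we could be fancy and decode that" point in the quoted text: a minimal userspace sketch of what decoding a page-fault error code might look like, assuming the value has already been pulled out of the ORIG_RAX slot of the dumped regs. The bit layout is the architectural x86 one; the PFEC_* names and pfec_decode() are hypothetical and made up for this illustration, not existing kernel symbols. Note that it applies to #PF error codes only, not to the #DF error code printed in the panic line.

```c
#include <stddef.h>
#include <stdio.h>

/* Architectural x86 #PF error code bits (names are hypothetical). */
#define PFEC_PRESENT (1UL << 0) /* 0: page not present, 1: protection violation */
#define PFEC_WRITE   (1UL << 1) /* 0: read access, 1: write access */
#define PFEC_USER    (1UL << 2) /* 0: kernel-mode access, 1: user-mode access */
#define PFEC_RSVD    (1UL << 3) /* reserved bit set in a paging-structure entry */
#define PFEC_INSTR   (1UL << 4) /* fault was an instruction fetch */
#define PFEC_PKEY    (1UL << 5) /* protection-key violation */

/*
 * Hypothetical helper: format a human-readable description of a #PF
 * error code into buf. Returns what snprintf() returns.
 */
static int pfec_decode(unsigned long ec, char *buf, size_t len)
{
	return snprintf(buf, len, "%s %s %s%s%s%s",
			(ec & PFEC_PRESENT) ? "protection" : "not-present",
			(ec & PFEC_WRITE) ? "write" : "read",
			(ec & PFEC_USER) ? "user" : "kernel",
			(ec & PFEC_RSVD) ? " rsvd" : "",
			(ec & PFEC_INSTR) ? " instr-fetch" : "",
			(ec & PFEC_PKEY) ? " pkey" : "");
}
```

With this sketch, an error code of 0x3 would be described as a kernel-mode write that hit a protection violation, and 0x0 as a kernel-mode read of a not-present page. An in-kernel version would presumably print next to the regs dump instead of formatting into a caller-supplied buffer.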