[crash] PANIC: double fault, error_code: 0x0

From: Ingo Molnar
Date: Fri Nov 24 2017 - 15:23:12 EST



* Ingo Molnar <mingo@xxxxxxxxxx> wrote:

> This is a repost of the latest entry-stack plus Kaiser bits from Andy Lutomirski
> (v3 series from today) and Dave Hansen (kaiser-414-tipwip-20171123 version),
> on top of latest tip:x86/urgent (12a78d43de76).
>
> This version is pretty well tested, at least on the usual x86 tree test systems.
> It has a couple of merge mistakes fixed, the biggest difference is in patch #22:
>
> x86/mm/kaiser: Prepare assembly for entry/exit CR3 switching
>
> The other patches are identical or very close to what I posted earlier today.

Here's a new bug: on a test system I get the double fault boot crash attached
below. The same bzImage crashes on other systems as well, so it's not CPU
dependent.

Via Kconfig-bisection I have narrowed it down to the following .config detail:
it's triggered by _disabling_ CONFIG_DEBUG_ENTRY and enabling CONFIG_KAISER=y.
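
For anyone who wants to check whether their own .config falls into the same bucket, here is a minimal sketch in Python. The helper names (parse_kconfig, matches_reported_combination) are hypothetical and not part of any kernel tooling; the only facts taken from this report are the two options named above. It assumes the usual .config conventions, where an enabled option appears as CONFIG_FOO=y and a disabled one as "# CONFIG_FOO is not set" (or is absent altogether).

```python
def parse_kconfig(text):
    """Map option name -> value for a kernel .config.

    Options written as '# CONFIG_FOO is not set' map to None.
    """
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# CONFIG_") and line.endswith(" is not set"):
            # Strip the leading '# ' and the trailing ' is not set'.
            name = line[2:-len(" is not set")]
            options[name] = None
        elif line.startswith("CONFIG_") and "=" in line:
            # Split on the first '=' only; string values may contain '='.
            name, value = line.split("=", 1)
            options[name] = value
    return options


def matches_reported_combination(text):
    """True if the config has CONFIG_KAISER=y and CONFIG_DEBUG_ENTRY off.

    An option missing from the file is treated as disabled (assumption
    matching how .config files are normally read).
    """
    opts = parse_kconfig(text)
    return (opts.get("CONFIG_KAISER") == "y"
            and opts.get("CONFIG_DEBUG_ENTRY") is None)
```

A match only means the config has the combination reported here; whether that alone is enough to reproduce the crash on another tree is not established.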

I.e. one of the sanity checks of CONFIG_DEBUG_ENTRY has some positive side effect.
I'll try to track down which one it is - any ideas meanwhile?

Thanks,

Ingo

[ 8.797733] calling pt_dump_init+0x0/0x3b @ 1
[ 8.803144] initcall pt_dump_init+0x0/0x3b returned 0 after 1 usecs
[ 8.810589] calling aes_init+0x0/0x11 @ 1
[ 8.815757] initcall aes_init+0x0/0x11 returned 0 after 141 usecs
[ 8.823020] calling ghash_pclmulqdqni_mod_init+0x0/0x54 @ 1
[ 8.831002] PANIC: double fault, error_code: 0x0
[ 8.831002] CPU: 11 PID: 260 Comm: modprobe Not tainted 4.14.0-01419-g1b46550a680d-dirty #17
[ 8.831002] Hardware name: Intel Corporation S2600GZ/S2600GZ, BIOS SE5C600.86B.02.02.0002.122320131210 12/23/2013
[ 8.831002] task: ffff880828ba8000 task.stack: ffffc90004444000
[ 8.831002] RIP: 0010:page_fault+0x11/0x60
[ 8.831002] RSP: 0000:ffffffffff0e7fc8 EFLAGS: 00010046
[ 8.831002] RAX: 00000000819d4d77 RBX: 0000000000000001 RCX: ffffffff819d4d77
[ 8.831002] RDX: 0000000000000003 RSI: 0000000000000010 RDI: ffffffffff0e8078
[ 8.831002] RBP: 0000000000000000 R08: 00007ffd7f1aa530 R09: 00007f9407f98400
[ 8.831002] R10: 0000000000000007 R11: 0000000000000000 R12: 00007ffd7f1aa680
[ 8.831002] R13: 00007f9407f91f80 R14: 0000000000000007 R15: 0000000000000000
[ 8.831002] FS: 00007f9407f8f700(0000) GS:ffff88082e640000(0000) knlGS:0000000000000000
[ 8.831002] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8.831002] CR2: ffffffffff0e7fb8 CR3: 0000000828bc4000 CR4: 00000000001406e0
[ 8.831002] Call Trace:
[ 8.831002] <SYSENTER>
[ 8.831002] ? __do_page_fault+0x4c0/0x4c0
[ 8.831002] ? page_fault+0x2c/0x60
[ 8.831002] ? native_iret+0x7/0x7
[ 8.831002] ? __do_page_fault+0x4c0/0x4c0
[ 8.831002] ? page_fault+0x2c/0x60
[ 8.831002] ? __entry_text_end+0x1/0x1
[ 8.831002] </SYSENTER>
[ 8.831002] Code: ff e8 a4 75 6a ff e9 9f 02 00 00 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 83 c4 88 f6 84 24 88 00 00 00 03 75 20 <e8> 4a 01 00 00 48 89 e7 48 8b 74 24 78 48 c7 44 24 78 ff ff ff
[ 8.831002] Kernel panic - not syncing: Machine halted.
[ 8.831002] CPU: 11 PID: 260 Comm: modprobe Not tainted 4.14.0-01419-g1b46550a680d-dirty #17
[ 8.831002] Hardware name: Intel Corporation S2600GZ/S2600GZ, BIOS SE5C600.86B.02.02.0002.122320131210 12/23/2013
[ 8.831002] Call Trace:
[ 8.831002] <#DF>
[ 8.831002] dump_stack+0x46/0x62
[ 8.831002] panic+0xde/0x221
[ 8.831002] df_debug+0x29/0x30
[ 8.831002] do_double_fault+0x8f/0x120
[ 8.831002] double_fault+0x22/0x30
[ 8.831002] RIP: 0010:page_fault+0x11/0x60
[ 8.831002] RSP: 0000:ffffffffff0e7fc8 EFLAGS: 00010046
[ 8.831002] RAX: 00000000819d4d77 RBX: 0000000000000001 RCX: ffffffff819d4d77
[ 8.831002] RDX: 0000000000000003 RSI: 0000000000000010 RDI: ffffffffff0e8078
[ 8.831002] RBP: 0000000000000000 R08: 00007ffd7f1aa530 R09: 00007f9407f98400
[ 8.831002] R10: 0000000000000007 R11: 0000000000000000 R12: 00007ffd7f1aa680
[ 8.831002] R13: 00007f9407f91f80 R14: 0000000000000007 R15: 0000000000000000
[ 8.831002] ? native_iret+0x7/0x7
[ 8.831002] WARNING: can't dereference iret registers at ffffffffff0e8048 for ip page_fault+0x11/0x60
[ 8.831002] </#DF>
[ 8.831002] <SYSENTER>
[ 8.831002] ? __do_page_fault+0x4c0/0x4c0
[ 8.831002] ? page_fault+0x2c/0x60
[ 8.831002] ? native_iret+0x7/0x7
[ 8.831002] ? __do_page_fault+0x4c0/0x4c0
[ 8.831002] ? page_fault+0x2c/0x60
[ 8.831002] ? __entry_text_end+0x1/0x1
[ 8.831002] </SYSENTER>
[ 8.831002] Kernel Offset: disabled
[ 8.831002] Rebooting in 1 seconds..
[ 8.831002] ACPI MEMORY or I/O RESET_REG.