Re: [RFC PATCH] x86/debug: Dump more detailed segfault info

From: Andy Lutomirski
Date: Sun Nov 13 2016 - 11:15:27 EST


On Nov 12, 2016 4:27 AM, "Borislav Petkov" <bp@xxxxxxxxx> wrote:
>
> On Sat, Nov 12, 2016 at 12:06:49PM +0100, Ingo Molnar wrote:
> > Note that on recent kernels, with printk log timestamping enabled, this looks
> > like:
> >
> > [ 206.721243] CR2: 0000000000000000 CR3: 000000042ab75000 CR4: 00000000001406e0
> > [ 206.729217] Code:
> > [ 206.731271] 55
> > [ 206.733046] 48
> > [ 206.733348] 89
> > [ 206.733665] e5
> > [ 206.733982] ff
>
> Hmm, this would then be no different with the "normal" Code: line as
> I simply stole it from there. And I have CONFIG_PRINTK_TIME=y. And it
> looks ok in my guest:
>
> [ 56.005550] strsep[3674]: segfault at 40066b ip 00007ffff7abe22b sp 00007fffffffeb40 error 7 in libc-2.19.so[7ffff7a33000+19f000]
> [ 56.009876] RIP: 0033:[<00007ffff7abe22b>] [<00007ffff7abe22b>] 0x7ffff7abe22b
> [ 56.011917] RSP: 002b:00007fffffffeb40 EFLAGS: 00010202
> [ 56.020225] RAX: 000000000040066b RBX: 0000000000400664 RCX: 0000000000000000
> [ 56.021387] RDX: 0000000000000000 RSI: 000000000000003d RDI: 0000000000400665
> [ 56.022373] RBP: 00007fffffffeb60 R08: 00007ffff7dd7c60 R09: 00007ffff7deae20
> [ 56.023348] R10: 00007fffffffe920 R11: 00007ffff7abe200 R12: 0000000000400460
> [ 56.024467] R13: 00007fffffffec50 R14: 0000000000000000 R15: 0000000000000000
> [ 56.025560] FS: 00007ffff7fdc700(0000) GS:ffff88007ec40000(0000) knlGS:0000000000000000
> [ 56.026665] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 56.027458] CR2: 000000000040066b CR3: 000000007aabe000 CR4: 00000000000406e0
> [ 56.028597] Code: 74 33 80 7e 01 00 74 22 48 89 df e8 5a 8a ff ff 48 85 c0 74 20 <c6> 00 00 48 83 c0 01 48 89 45 00 48 89 d8 48 83 c4 08 5b 5d c3 0f b6 13 38 d0 74 29 84 d2 75 15 48 c7 45 00 00 00 00 00 48 83 c4
>
> So, theoretically, show_regs() would generate the same thing on your
> machine. Normal splats look ok here too:
>
> [ 228.093462] sysrq: SysRq : Trigger a crash
> [ 228.095306] BUG: unable to handle kernel NULL pointer dereference at (null)
> [ 228.096955] IP: [<ffffffff81369d5b>] sysrq_handle_crash+0x1b/0x30
> [ 228.096955] PGD 7abc8067 [ 228.096955] PUD 79a26067
> PMD 0 [ 228.096955]
> [ 228.096955] Oops: 0002 [#1] PREEMPT SMP
> [ 228.096955] Modules linked in:
> [ 228.096955] CPU: 3 PID: 3692 Comm: bash Not tainted 4.9.0-rc4+ #32
> [ 228.096955] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Debian-1.8.2-1 04/01/2014
> [ 228.096955] task: ffff88007936c800 task.stack: ffffc90002e38000
> [ 228.096955] RIP: 0010:[<ffffffff81369d5b>] [<ffffffff81369d5b>] sysrq_handle_crash+0x1b/0x30
> [ 228.096955] RSP: 0018:ffffc90002e3bde8 EFLAGS: 00010246
> [ 228.096955] RAX: 0000000000000000 RBX: 0000000000000063 RCX: 0000000000000000
> [ 228.096955] RDX: 0000000000000001 RSI: ffffffff810a3e13 RDI: 0000000000000063
> [ 228.096955] RBP: ffffc90002e3bde8 R08: 0000000000000001 R09: 0000000000000006
> [ 228.096955] R10: 0000000000000001 R11: 000000000000018f R12: 000000000000000a
> [ 228.096955] R13: ffffffff81c569c0 R14: 0000000000000000 R15: 0000000000000000
> [ 228.096955] FS: 00007ffff7fdb700(0000) GS:ffff88007ecc0000(0000) knlGS:0000000000000000
> [ 228.096955] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 228.096955] CR2: 0000000000000000 CR3: 0000000079831000 CR4: 00000000000406e0
> [ 228.096955] Stack:
> [ 228.096955] ffffc90002e3be18 ffffffff8136a233 0000000000000002 fffffffffffffffb
> [ 228.096955] ffff88007a883d00 0000000000705408 ffffc90002e3be30 ffffffff8136a66f
> [ 228.096955] ffff88007b9c6540 ffffc90002e3be50 ffffffff811dbcf2 ffff88007a883d00
> [ 228.096955] Call Trace:
> [ 228.096955] [<ffffffff8136a233>] __handle_sysrq+0x103/0x160
> [ 228.096955] [<ffffffff8136a66f>] write_sysrq_trigger+0x2f/0x40
> [ 228.096955] [<ffffffff811dbcf2>] proc_reg_write+0x42/0x70
> [ 228.096955] [<ffffffff8117abd8>] __vfs_write+0x28/0x120
> [ 228.096955] [<ffffffff8107b5bf>] ? preempt_count_sub+0xaf/0x120
> [ 228.096955] [<ffffffff8107b5bf>] ? preempt_count_sub+0xaf/0x120
> [ 228.096955] [<ffffffff8117e182>] ? __sb_start_write+0x52/0xe0
> [ 228.096955] [<ffffffff8117b930>] vfs_write+0xc0/0x180
> [ 228.096955] [<ffffffff8117cbef>] SyS_write+0x4f/0xb0
> [ 228.096955] [<ffffffff816e5e2e>] entry_SYSCALL_64_fastpath+0x1c/0xac
> [ 228.096955] Code: 6e bf da ff eb e4 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 e8 a2 4a d4 ff c7 05 f0 25 a9 00 01 00 00 00 0f ae f8 <c6> 04 25 00 00 00 00 00 0f 0b 90 66 2e 0f 1f 84 00 00 00 00 00
> [ 228.096955] RIP [<ffffffff81369d5b>] sysrq_handle_crash+0x1b/0x30
> [ 228.096955] RSP <ffffc90002e3bde8>
> [ 228.096955] CR2: 0000000000000000
> [ 228.137948] ---[ end trace cfc5457f348eda2e ]---
> [ 228.138698] Kernel panic - not syncing: Fatal exception
> [ 228.140137] Kernel Offset: disabled
> [ 228.140639] ---[ end Kernel panic - not syncing: Fatal exception
>
> ...
> > So I don't mind the feature, but this should only dump code that is user-readable.
>
> Yeah, this is purely a debug feature so how about I stick it behind a
> switch in debugfs which is root-only and it is disabled by default? When
> you boot, you do:
>
> # echo 1 > /sys/kernel/debug/x86/detailed_segfault

How about dropping the __ in front of the copy?
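
For context: in kernels of this era, copy_from_user() runs the
access_ok() range check before copying, while __copy_from_user() skips
it and leaves the check to the caller. What follows is a minimal
userspace sketch of that distinction, not the patch itself. The names
model_access_ok(), model_may_dump_code() and MODEL_USER_LIMIT are
hypothetical, and the limit value assumes x86-64 with 4-level paging.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: user address limit on x86-64 with 4-level paging,
 * i.e. (1 << 47) - 4096. Not taken from the patch. */
#define MODEL_USER_LIMIT 0x00007ffffffff000ULL

/*
 * Hypothetical model of the range check done by the checked copy:
 * true only if [addr, addr + len) lies wholly below the user limit.
 * Written so that addr + len is never computed and cannot overflow.
 */
static bool model_access_ok(uint64_t addr, uint64_t len)
{
	return len <= MODEL_USER_LIMIT && addr <= MODEL_USER_LIMIT - len;
}

/*
 * Hypothetical policy for the dump path: copy and print the code bytes
 * only when the range starting at the faulting IP passes the check.
 */
static bool model_may_dump_code(uint64_t ip, uint64_t len)
{
	return model_access_ok(ip, len);
}
```

With the addresses from the logs quoted above, the user IP
00007ffff7abe22b passes this check and the kernel IP ffffffff81369d5b
does not, which is the "only dump code that is user-readable" behaviour
Ingo asked for.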