Re: [PATCH 3/6] x86/dumpstack: make printk_stack_address() more generally useful

From: Josh Poimboeuf
Date: Wed Aug 31 2016 - 12:53:14 EST


On Wed, Aug 24, 2016 at 02:37:21PM -0400, Linus Torvalds wrote:
> On Wed, Aug 24, 2016 at 2:22 PM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> >
> > I actively disable KASLR on my dev box and feed these hex numbers into
> > addr2line -ie vmlinux to find where in the function we are.
> >
> > Having the option to make %pB generate them works for me.
>
> Yeah, considering that this is the only place this is used, changing
> %pB sounds quite reasonable.
>
> We could perhaps make %pB show the hex numbers and address (so pB
> would expand to "[<hex>] symbolname".if
>
> (a) not randomizing (so the hex numbers _may_ be useful)
>
> (b) kptr_restrict is 0 (so the hex numbers are "safe" in the dmesg)
>
> and fall back to just the symbolic name if either of those aren't true?
>
> And obviously, if KALLSYMS isn't enabled, you always show hex
> numbers.. That's already the case (but we might want to add the "[<>}'
> markers around the hex numbers just to make the user space automation
> we do have work).

Here's a simple patch to skip the kernel text addresses in the stack
dump, based on top of the up-thread patch set.

It doesn't do the '%pB' thing because now in tip there's a non-printk
user of it in /proc/<pid>/stack, for which there are different rules
(print zeros on kptr_restrict vs skipping the number altogether, for
example). And anyway that could maybe be added later.

It also doesn't check for randomization or kptr_restrict, which keeps it
simple and maximizes its "usefulness" ;-)

Thoughts?

---

From: Josh Poimboeuf <jpoimboe@xxxxxxxxxx>
Subject: [PATCH] x86/dumpstack: don't print kernel text addresses in stack dump

Printing kernel text addresses in stack dumps is of questionable value,
especially now that address randomization is becoming common.

It can be a security issue because it leaks kernel addresses.

It also affects the usefulness of the stack dump. Linus says:

"I actually spend time cleaning up commit messages in logs, because
useless data that isn't actually information (random hex numbers) is
actively detrimental.

It makes commit logs less legible.

It also makes it harder to parse dumps.

It's not useful. That makes it actively bad.

I probably look at more oops reports than most people. I have not
found the hex numbers useful for the last five years, because they are
just randomized crap.

The stack content thing just makes code scroll off the screen etc, for
example."

The only real downside to removing these addresses is that they can be
used to disambiguate duplicate symbol names. However such cases are
rare, and the context of the stack dump should be enough to be able to
figure it out.

Some kernel developers like to use addr2line to convert the addresses to
a file name and line (which only works without randomization). They can
instead use gdb for that purpose:

$ echo "list *driver_probe_device+0x223" |gdb vmlinux |grep "is in"
(gdb) 0xffffffff815b5d83 is in driver_probe_device (/home/jpoimboe/git/linux/drivers/base/dd.c:378).

(But note that when there are duplicate symbol names, that will only
show the first symbol it finds.)

Here's an example of what a stack dump looks like after this change:

kernel BUG at /home/jpoimboe/git/linux/include/linux/scatterlist.h:140!
invalid opcode: 0000 [#1] PREEMPT SMP
Modules linked in: ...
CPU: 0 PID: 338 Comm: systemd-udevd Not tainted 4.8.0-rc3+ #7
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.1-1.fc24 04/01/2014
task: ffff880078978000 task.stack: ffffc900008f0000
RIP: sg_init_one+0x87/0xa0
RSP: 0018:ffffc900008f3a68 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffffc900008f3a98 RCX: 0000000000000028
RDX: 0000000000000041 RSI: 0000000000000000 RDI: 00004100008f3a98
RBP: ffffc900008f3a80 R08: 00000002a014a073 R09: ffffc900008f3aa0
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000008
R13: ffffc900008f3aa0 R14: 0000000000000000 R15: 0000000000000001
FS: 00007fa18fc20880(0000) GS:ffff88007d200000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fa18fc29000 CR3: 00000000799f8000 CR4: 00000000001406f0
Stack:
ffff880079af7350 ffff880079905400 0000000000000000 ffffc900008f3ae0
ffffffffa0196610 0000000000000001 00010000ffffffff 0000000087654321
0000000000000002 0000000000000000 0000000000000000 0000000000000000
Call Trace:
__send_control_msg+0x80/0xf0 [virtio_console]
virtcons_probe+0x256/0x400 [virtio_console]
virtio_dev_probe+0x144/0x1e0 [virtio]
driver_probe_device+0x223/0x430
__driver_attach+0xe3/0xf0
? driver_probe_device+0x430/0x430
bus_for_each_dev+0x73/0xc0
driver_attach+0x1e/0x20
bus_add_driver+0x173/0x270
? virtio_cons_early_init+0x1d/0x1d [virtio_console]
driver_register+0x60/0xe0
? virtio_cons_early_init+0x1d/0x1d [virtio_console]
register_virtio_driver+0x20/0x30 [virtio]
init+0x9f/0xfe3 [virtio_console]
do_one_initcall+0x50/0x180
? do_init_module+0x27/0x1f9
? rcu_read_lock_sched_held+0x45/0x80
? kmem_cache_alloc_trace+0x28a/0x2f0
do_init_module+0x5f/0x1f9
load_module+0x270b/0x2c20
? __symbol_put+0x90/0x90
? show_coresize+0x30/0x30
? vfs_read+0x121/0x140
SYSC_finit_module+0xdf/0x110
SyS_finit_module+0xe/0x10
entry_SYSCALL_64_fastpath+0x1f/0xbd
Code: ff 83 e2 03 48 c1 e8 0c 81 e3 ff 0f 00 00 45 89 65 14 48 c1 e0 06 41 89 5d 10 48 01 c8 48 09 d0 5b 49 89 45 08 41 5c 41 5d 5d c3 <0f> 0b 0f 0b 0f 0b 48 8b 0d fc c6 9a 00 eb a6 66 2e 0f 1f 84 00
RIP: sg_init_one+0x87/0xa0 RSP: ffffc900008f3a68

Suggested-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Signed-off-by: Josh Poimboeuf <jpoimboe@xxxxxxxxxx>
---
arch/x86/include/asm/kdebug.h | 1 -
arch/x86/kernel/dumpstack.c | 18 ++++--------------
arch/x86/kernel/process_32.c | 7 +++----
arch/x86/kernel/process_64.c | 3 +--
arch/x86/mm/fault.c | 3 +--
arch/x86/platform/uv/uv_nmi.c | 4 ++--
6 files changed, 11 insertions(+), 25 deletions(-)

diff --git a/arch/x86/include/asm/kdebug.h b/arch/x86/include/asm/kdebug.h
index d318811..29a594a 100644
--- a/arch/x86/include/asm/kdebug.h
+++ b/arch/x86/include/asm/kdebug.h
@@ -21,7 +21,6 @@ enum die_val {
DIE_NMIUNKNOWN,
};

-extern void printk_address(unsigned long address);
extern void die(const char *, struct pt_regs *,long);
extern int __must_check __die(const char *, struct pt_regs *, long);
extern void show_stack_regs(struct pt_regs *regs);
diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index c6c6c39..a56c04c 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -29,14 +29,7 @@ static void printk_stack_address(unsigned long address, int reliable,
char *log_lvl)
{
touch_nmi_watchdog();
- printk("%s [<%p>] %s%pB\n",
- log_lvl, (void *)address, reliable ? "" : "? ",
- (void *)address);
-}
-
-void printk_address(unsigned long address)
-{
- pr_cont(" [<%p>] %pS\n", (void *)address, (void *)address);
+ printk("%s %s%pB\n", log_lvl, reliable ? "" : "? ", (void *)address);
}

/*
@@ -277,14 +270,11 @@ int __die(const char *str, struct pt_regs *regs, long err)
sp = kernel_stack_pointer(regs);
savesegment(ss, ss);
}
- printk(KERN_EMERG "EIP: [<%08lx>] ", regs->ip);
- print_symbol("%s", regs->ip);
- printk(" SS:ESP %04x:%08lx\n", ss, sp);
+ printk(KERN_EMERG "EIP: %pS SS:ESP: %04x:%08lx\n",
+ (void *)regs->ip, ss, sp);
#else
/* Executive summary in case the oops scrolled away */
- printk(KERN_ALERT "RIP ");
- printk_address(regs->ip);
- printk(" RSP <%016lx>\n", regs->sp);
+ printk(KERN_ALERT "RIP: %pS RSP: %016lx\n", (void *)regs->ip, regs->sp);
#endif
return 0;
}
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 404efdf..9f828a9 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -72,10 +72,9 @@ void __show_regs(struct pt_regs *regs, int all)
savesegment(gs, gs);
}

- printk(KERN_DEFAULT "EIP: %04x:[<%08lx>] EFLAGS: %08lx CPU: %d\n",
- (u16)regs->cs, regs->ip, regs->flags,
- smp_processor_id());
- print_symbol("EIP is at %s\n", regs->ip);
+ printk(KERN_DEFAULT "EIP: %pS\n", (void *)regs->ip);
+ printk(KERN_DEFAULT "EFLAGS: %08lx CPU: %d\n", regs->flags,
+ smp_processor_id());

printk(KERN_DEFAULT "EAX: %08lx EBX: %08lx ECX: %08lx EDX: %08lx\n",
regs->ax, regs->bx, regs->cx, regs->dx);
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index b812cd0..d250088 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -60,8 +60,7 @@ void __show_regs(struct pt_regs *regs, int all)
unsigned int fsindex, gsindex;
unsigned int ds, cs, es;

- printk(KERN_DEFAULT "RIP: %04lx:[<%016lx>] ", regs->cs & 0xffff, regs->ip);
- printk_address(regs->ip);
+ printk(KERN_DEFAULT "RIP: %pS\n", (void *)regs->ip);
printk(KERN_DEFAULT "RSP: %04lx:%016lx EFLAGS: %08lx\n", regs->ss,
regs->sp, regs->flags);
printk(KERN_DEFAULT "RAX: %016lx RBX: %016lx RCX: %016lx\n",
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index dc80230..0e7b66c 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -679,8 +679,7 @@ show_fault_oops(struct pt_regs *regs, unsigned long error_code,
printk(KERN_CONT "paging request");

printk(KERN_CONT " at %p\n", (void *) address);
- printk(KERN_ALERT "IP:");
- printk_address(regs->ip);
+ printk(KERN_ALERT "IP: %pS\n", (void *)regs->ip);

dump_pagetable(address);
}
diff --git a/arch/x86/platform/uv/uv_nmi.c b/arch/x86/platform/uv/uv_nmi.c
index cd5173a..8410e7d 100644
--- a/arch/x86/platform/uv/uv_nmi.c
+++ b/arch/x86/platform/uv/uv_nmi.c
@@ -387,8 +387,8 @@ static void uv_nmi_dump_cpu_ip_hdr(void)
/* Dump Instruction Pointer info */
static void uv_nmi_dump_cpu_ip(int cpu, struct pt_regs *regs)
{
- pr_info("UV: %4d %6d %-32.32s ", cpu, current->pid, current->comm);
- printk_address(regs->ip);
+ pr_info("UV: %4d %6d %-32.32s %pS",
+ cpu, current->pid, current->comm, (void *)regs->ip);
}

/*
--
2.7.4