Re: [PATCH] x86/stacktrace: Do not access user space memory unnecessarily

From: Peter Zijlstra
Date: Tue Jul 02 2019 - 03:28:43 EST


On Tue, Jul 02, 2019 at 02:31:51PM +0900, Eiichi Tsukata wrote:
> Put the boundary check before it accesses user space to prevent unnecessary
> access which might crash the machine.
>
> Especially, ftrace preemptirq/irq_disable event with user stack trace
> option can trigger SEGV in pid 1 which leads to panic.
>
> Reproducer:
>
> CONFIG_PREEMPTIRQ_TRACEPOINTS=y
> # echo 1 > events/preemptirq/enable
> # echo userstacktrace > trace_options
>
> Output:
>
> Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
> CPU: 1 PID: 1 Comm: systemd Not tainted 5.2.0-rc7+ #10

Killing systemd is a feature :-)

> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
> Call Trace:
> dump_stack+0x67/0x90
> panic+0x100/0x2c6
> do_exit.cold+0x4e/0x101
> do_group_exit+0x3a/0xa0
> get_signal+0x14a/0x8e0
> do_signal+0x36/0x650
> exit_to_usermode_loop+0x92/0xb0
> prepare_exit_to_usermode+0x6f/0xb0
> retint_user+0x8/0x18
> RIP: 0033:0x55be7ad1c89f
> Code: Bad RIP value.

^^^ that's weird; no amount of unwinding should affect regs->ip.

> RSP: 002b:00007ffe329a4b00 EFLAGS: 00010202
> RAX: 0000000000000768 RBX: 00007ffe329a4ba0 RCX: 00007ff0063aa469
> RDX: 00007ff0066761de RSI: 00007ffe329a4b20 RDI: 0000000000000768
> RBP: 000000000000000b R08: 0000000000000000 R09: 00007ffe329a4e2f
> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000768
> R13: 0000000000000000 R14: 0000000000000004 R15: 000055be7b3d3560
> Kernel Offset: 0x2a000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
>
> Fixes: 02b67518e2b1 ("tracing: add support for userspace stacktraces in tracing/iter_ctrl")
> Signed-off-by: Eiichi Tsukata <devel@xxxxxxxxxxxx>
> ---
> arch/x86/kernel/stacktrace.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kernel/stacktrace.c b/arch/x86/kernel/stacktrace.c
> index 2abf27d7df6b..6d0c608ffe34 100644
> --- a/arch/x86/kernel/stacktrace.c
> +++ b/arch/x86/kernel/stacktrace.c
> @@ -123,12 +123,12 @@ void arch_stack_walk_user(stack_trace_consume_fn consume_entry, void *cookie,
>  	while (1) {
>  		struct stack_frame_user frame;
>
> +		if ((unsigned long)fp < regs->sp)
> +			break;
>  		frame.next_fp = NULL;
>  		frame.ret_addr = 0;
>  		if (!copy_stack_frame(fp, &frame))
>  			break;
> -		if ((unsigned long)fp < regs->sp)
> -			break;

Aside from which, that doesn't make sense: even if copy_stack_frame() were
fed utter garbage, it should never result in the user process being
affected.

copy_stack_frame() does: "pagefault_disable(); __copy_from_user_inatomic()",
which should take the fault and catch it in an extable and have it return
-EFAULT.
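
To make the expectation concrete, here is a rough userspace model of that
loop. It is only a sketch, not kernel code: model_copy_from_user(),
model_put_frame(), model_walk_user(), USER_BASE and the user_mem array are
all made-up stand-ins, and the -EFAULT return simply mirrors the description
above. The point it tries to show is that a fallible copy makes the walk
stop cleanly on a garbage fp, whether the boundary check sits before or
after the copy.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy "user address space": only [USER_BASE, USER_BASE + 256) is mapped. */
#define USER_BASE 0x1000UL
static unsigned char user_mem[256];

/* Same shape as the frame in the quoted hunk, with plain integers. */
struct stack_frame_user {
	unsigned long next_fp;
	unsigned long ret_addr;
};

/*
 * Hypothetical stand-in for the fallible copy: an unmapped source address
 * yields -EFAULT instead of a fault reaching the "process".
 */
static int model_copy_from_user(void *dst, unsigned long src, size_t len)
{
	unsigned long off = src - USER_BASE;

	if (src < USER_BASE || off > sizeof(user_mem) ||
	    len > sizeof(user_mem) - off)
		return -EFAULT;
	memcpy(dst, &user_mem[off], len);
	return 0;
}

/* Test helper: plant a frame at a mapped address. */
static void model_put_frame(unsigned long addr, unsigned long next_fp,
			    unsigned long ret_addr)
{
	struct stack_frame_user f = { next_fp, ret_addr };

	assert(addr >= USER_BASE &&
	       addr - USER_BASE + sizeof(f) <= sizeof(user_mem));
	memcpy(&user_mem[addr - USER_BASE], &f, sizeof(f));
}

/*
 * Walk the frame chain starting at fp and store return addresses in out[].
 * check_first selects the ordering from the patch (boundary check before
 * the copy); otherwise the check runs after the copy.
 */
static size_t model_walk_user(unsigned long fp, unsigned long sp,
			      unsigned long *out, size_t max, bool check_first)
{
	size_t n = 0;

	while (n < max) {
		struct stack_frame_user frame = { 0, 0 };

		if (check_first && fp < sp)
			break;
		if (model_copy_from_user(&frame, fp, sizeof(frame)))
			break;	/* bad fp: the copy fails and the walk ends */
		if (!check_first && fp < sp)
			break;
		if (frame.ret_addr)
			out[n++] = frame.ret_addr;
		if (fp == frame.next_fp)
			break;
		fp = frame.next_fp;
	}
	return n;
}
```

In this model both orderings produce the same trace and neither one touches
anything outside the mapped range, which is why a crash in the traced task
looks like it has to come from somewhere else.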

Something is really fishy here, maybe Josh has an idea?

>  		if (frame.ret_addr) {
>  			if (!consume_entry(cookie, frame.ret_addr, false))
>  				return;
> --
> 2.21.0
>