Re: [PATCH] riscv: stacktrace: Fix NULL pointer dereference

From: Jisheng Zhang
Date: Thu Jul 08 2021 - 09:17:32 EST


Hi Palmer,

On Sat, 19 Jun 2021 00:13:32 +0800
Jisheng Zhang wrote:

> From: Jisheng Zhang <jszhang@xxxxxxxxxx>
>
> When CONFIG_FRAME_POINTER=y, calling dump_stack() can always trigger
> NULL pointer dereference panic similar as below:
>
> [ 0.396060] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.13.0-rc5+ #47
> [ 0.396692] Hardware name: riscv-virtio,qemu (DT)
> [ 0.397176] Call Trace:
> [ 0.398191] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000960
> [ 0.399487] Oops [#1]
> [ 0.399739] Modules linked in:
> [ 0.400135] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.13.0-rc5+ #47
> [ 0.400570] Hardware name: riscv-virtio,qemu (DT)
> [ 0.400926] epc : walk_stackframe+0xc4/0xdc
> [ 0.401291] ra : dump_backtrace+0x30/0x38
> [ 0.401630] epc : ffffffff80004922 ra : ffffffff8000496a sp : ffffffe000f3bd00
> [ 0.402115] gp : ffffffff80cfdcb8 tp : ffffffe000f30000 t0 : ffffffff80d0b0cf
> [ 0.402602] t1 : ffffffff80d0b0c0 t2 : 0000000000000000 s0 : ffffffe000f3bd60
> [ 0.403071] s1 : ffffffff808bc2e8 a0 : 0000000000001000 a1 : 0000000000000000
> [ 0.403448] a2 : ffffffff803d7088 a3 : ffffffff808bc2e8 a4 : 6131725dbc24d400
> [ 0.403820] a5 : 0000000000001000 a6 : 0000000000000002 a7 : ffffffffffffffff
> [ 0.404226] s2 : 0000000000000000 s3 : 0000000000000000 s4 : 0000000000000000
> [ 0.404634] s5 : ffffffff803d7088 s6 : ffffffff808bc2e8 s7 : ffffffff80630650
> [ 0.405085] s8 : ffffffff80912a80 s9 : 0000000000000008 s10: ffffffff804000fc
> [ 0.405388] s11: 0000000000000000 t3 : 0000000000000043 t4 : ffffffffffffffff
> [ 0.405616] t5 : 000000000000003d t6 : ffffffe000f3baa8
> [ 0.405793] status: 0000000000000100 badaddr: 0000000000000960 cause: 000000000000000d
> [ 0.406135] [<ffffffff80004922>] walk_stackframe+0xc4/0xdc
> [ 0.407032] [<ffffffff8000496a>] dump_backtrace+0x30/0x38
> [ 0.407797] [<ffffffff803d7100>] show_stack+0x40/0x4c
> [ 0.408234] [<ffffffff803d9e5c>] dump_stack+0x90/0xb6
> [ 0.409019] [<ffffffff8040423e>] ptdump_init+0x20/0xc4
> [ 0.409681] [<ffffffff800015b6>] do_one_initcall+0x4c/0x226
> [ 0.410110] [<ffffffff80401094>] kernel_init_freeable+0x1f4/0x258
> [ 0.410562] [<ffffffff803dba88>] kernel_init+0x22/0x148
> [ 0.410959] [<ffffffff800029e2>] ret_from_exception+0x0/0x14
> [ 0.412241] ---[ end trace b2ab92c901b96251 ]---
> [ 0.413099] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
>
> The reason is the task is NULL when we finally call walk_stackframe()
> the NULL is passed from __dump_stack():
>
> |static void __dump_stack(void)
> |{
> | dump_stack_print_info(KERN_DEFAULT);
> | show_stack(NULL, NULL, KERN_DEFAULT);
> |}
>
> Fix this issue by checking "task == NULL" case in walk_stackframe().
>
> Fixes: eac2f3059e02 ("riscv: stacktrace: fix the riscv stacktrace when CONFIG_FRAME_POINTER enabled"
> Signed-off-by: Jisheng Zhang <jszhang@xxxxxxxxxx>

What about this patch? This is a fix to an obvious bug: call dump_stack() will
panic.

Thanks
> ---
> arch/riscv/kernel/stacktrace.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/riscv/kernel/stacktrace.c b/arch/riscv/kernel/stacktrace.c
> index bde85fc53357..7bc8af75933a 100644
> --- a/arch/riscv/kernel/stacktrace.c
> +++ b/arch/riscv/kernel/stacktrace.c
> @@ -27,7 +27,7 @@ void notrace walk_stackframe(struct task_struct *task, struct pt_regs *regs,
> fp = frame_pointer(regs);
> sp = user_stack_pointer(regs);
> pc = instruction_pointer(regs);
> - } else if (task == current) {
> + } else if (task == NULL || task == current) {
> fp = (unsigned long)__builtin_frame_address(1);
> sp = (unsigned long)__builtin_frame_address(0);
> pc = (unsigned long)__builtin_return_address(0);