Re: [PATCH v4 1/2] riscv: move sbi_init() earlier before jump_label_init()

From: Anup Patel
Date: Sun Jun 26 2022 - 00:32:38 EST


On Sat, May 21, 2022 at 8:13 PM Jisheng Zhang <jszhang@xxxxxxxxxx> wrote:
>
> We call jump_label_init() in setup_arch() in order to use the static key
> mechanism earlier, but the riscv jump label code relies on the SBI functions.
> If we enable a static key before sbi_init(), the code path looks like:
> static_branch_enable()
> ..
> arch_jump_label_transform()
> patch_text_nosync()
> flush_icache_range()
> flush_icache_all()
> sbi_remote_fence_i() for CONFIG_RISCV_SBI case
> __sbi_rfence()
>
> Since SBI isn't initialized yet, this ends in a NULL dereference! Here is a
> typical panic log:
>
> [ 0.000000] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
> [ 0.000000] Oops [#1]
> [ 0.000000] Modules linked in:
> [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.18.0-rc7+ #79
> [ 0.000000] Hardware name: riscv-virtio,qemu (DT)
> [ 0.000000] epc : 0x0
> [ 0.000000] ra : sbi_remote_fence_i+0x1e/0x26
> [ 0.000000] epc : 0000000000000000 ra : ffffffff80005826 sp : ffffffff80c03d50
> [ 0.000000] gp : ffffffff80ca6178 tp : ffffffff80c0ad80 t0 : 6200000000000000
> [ 0.000000] t1 : 0000000000000000 t2 : 62203a6b746e6972 s0 : ffffffff80c03d60
> [ 0.000000] s1 : ffffffff80001af6 a0 : 0000000000000000 a1 : 0000000000000000
> [ 0.000000] a2 : 0000000000000000 a3 : 0000000000000000 a4 : 0000000000000000
> [ 0.000000] a5 : 0000000000000000 a6 : 0000000000000000 a7 : 0000000000080200
> [ 0.000000] s2 : ffffffff808b3e48 s3 : ffffffff808bf698 s4 : ffffffff80cb2818
> [ 0.000000] s5 : 0000000000000001 s6 : ffffffff80c9c345 s7 : ffffffff80895aa0
> [ 0.000000] s8 : 0000000000000001 s9 : 000000000000007f s10: 0000000000000000
> [ 0.000000] s11: 0000000000000000 t3 : ffffffff80824d08 t4 : 0000000000000022
> [ 0.000000] t5 : 000000000000003d t6 : 0000000000000000
> [ 0.000000] status: 0000000000000100 badaddr: 0000000000000000 cause: 000000000000000c
> [ 0.000000] ---[ end trace 0000000000000000 ]---
> [ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
> [ 0.000000] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---
>
> Fix this issue by moving sbi_init() earlier, before jump_label_init().
>
> Signed-off-by: Jisheng Zhang <jszhang@xxxxxxxxxx>

We are seeing a similar crash when booting the kernel via EDK2 with RNG enabled:

Shell> fs0:\Image root=/dev/vda2 rootwait console=ttyS0 earlycon=uart8250,mmio,0x10000000 initrd=\initramfs.cp
EFI stub: Booting Linux Kernel...
EFI stub: Using DTB from configuration table
EFI stub: Exiting boot services...
[ 0.000000] Linux version 5.19.0-rc3 (oe-user@oe-host) (riscv64-unknown-linux-gnu-gcc (Ventana-2022.05.16) 12.1.0, GNU ld (Ventana-2022.05.16) 2.37.90.20220201) #1 SMP Thu Jun 23 05:33:13 UTC 2022
[ 0.000000] OF: fdt: Ignoring memory range 0x80000000 - 0x81200000
[ 0.000000] earlycon: uart8250 at MMIO 0x0000000010000000 (options '')
[ 0.000000] printk: bootconsole [uart8250] enabled
[ 0.000000] efi: EFI v2.70 by EDK II
[ 0.000000] efi: RNG=0xff94fd98 MEMRESERVE=0xfe658f18
[ 0.000000] efi: seeding entropy pool
[ 0.000000] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[ 0.000000] Oops [#1]
[ 0.000000] Modules linked in:
[ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.19.0-rc3 #1
[ 0.000000] epc : 0x0
[ 0.000000] ra : sbi_remote_fence_i+0x1e/0x26
[ 0.000000] epc : 0000000000000000 ra : ffffffff800080f8 sp : ffffffff81203cd0
[ 0.000000] gp : ffffffff812f1d40 tp : ffffffff8120da80 t0 : 0000000000cb8266
[ 0.000000] t1 : 000000006d5e5146 t2 : 0000000058000000 s0 : ffffffff81203ce0
[ 0.000000] s1 : ffffffff8047586a a0 : 0000000000000000 a1 : 0000000000000000
[ 0.000000] a2 : 0000000000000000 a3 : 0000000000000000 a4 : 0000000000000000
[ 0.000000] a5 : 0000000000000000 a6 : 0000000000000000 a7 : 0000000000000000
[ 0.000000] s2 : ffffffff80dea320 s3 : ffffffff80deabb0 s4 : ffffffff81353d48
[ 0.000000] s5 : 0000000000000001 s6 : 00000000fffde848 s7 : 0000000000000004
[ 0.000000] s8 : 0000000081021714 s9 : 000000008101e6f0 s10: 00000000fffde780
[ 0.000000] s11: 0000000000000004 t3 : 000000001467a415 t4 : 0000000000000000
[ 0.000000] t5 : 00000000007627e0 t6 : ffffffffbc865574
[ 0.000000] status: 0000000200000100 badaddr: 0000000000000000 cause: 000000000000000c
[ 0.000000] ---[ end trace 0000000000000000 ]---
[ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
[ 0.000000] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---

This patch fixes the above crash as well.
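
For anyone following the traces: both logs show epc at 0x0 with cause 0xc
(instruction page fault) and ra inside sbi_remote_fence_i(), which is what a
call through a still-NULL function pointer looks like. Below is a minimal
userspace sketch of that ordering problem, assuming __sbi_rfence is a function
pointer that sbi_init() fills in. All names in it (rfence_backend,
mock_sbi_init, mock_remote_fence_i, rfence_v02) are hypothetical stand-ins,
not the kernel's code, and the NULL check exists only so the sketch can report
the problem instead of crashing.

```c
#include <stddef.h>

/* Hypothetical stand-in for the backend pointer selected during init.
 * Like any static pointer, it starts out NULL. */
static int (*rfence_backend)(int fid);

/* Hypothetical backend; a real one would issue the SBI call. */
static int rfence_v02(int fid)
{
	return fid;
}

/* Stand-in for sbi_init(): picks the backend. */
static void mock_sbi_init(void)
{
	rfence_backend = rfence_v02;
}

/* Stand-in for sbi_remote_fence_i(). The guard is an addition for this
 * sketch only; without it, calling before init jumps to address 0. */
static int mock_remote_fence_i(void)
{
	if (rfence_backend == NULL)
		return -1;
	return rfence_backend(0);
}
```

Calling mock_remote_fence_i() before mock_sbi_init() returns -1 here; after
init it returns 0. Moving the init call earlier, as the patch does, means
the pointer is already set by the time the first caller reaches it.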

Reviewed-by: Anup Patel <anup@xxxxxxxxxxxxxx>

Thanks,
Anup

> ---
> arch/riscv/kernel/setup.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
> index 834eb652a7b9..d150cedeb7e0 100644
> --- a/arch/riscv/kernel/setup.c
> +++ b/arch/riscv/kernel/setup.c
> @@ -268,6 +268,7 @@ void __init setup_arch(char **cmdline_p)
> *cmdline_p = boot_command_line;
>
> early_ioremap_setup();
> + sbi_init();
> jump_label_init();
> parse_early_param();
>
> @@ -284,7 +285,6 @@ void __init setup_arch(char **cmdline_p)
> misc_mem_init();
>
> init_resources();
> - sbi_init();
>
> #ifdef CONFIG_KASAN
> kasan_init();
> --
> 2.34.1
>