Re: RISC-V: patched kexec-tools on github for review/testing

From: Nick Kossifidis
Date: Sat Oct 09 2021 - 09:25:18 EST


On 2021-10-06 14:10, Alexandre Ghiti wrote:

So I followed the instructions here:
https://documentation.suse.com/fr-fr/sles/12-SP3/html/SLES-all/cha-tuning-kexec.html#cha-tuning-kexec-basic-usage,
below the output on an Unmatched board using a vmlinux stored on a sd
card:

ubuntu@ubuntu:~$ sudo sbin/kexec -l vmlinux --append="$(cat /proc/cmdline)" --initrd=/boot/initrd.img
Warning: No cmdline provided, using append string as cmdline
Warning: No dtb provided, using /sys/firmware/fdt
[ 1813.472671] INFO: task kworker/1:0:988 blocked for more than 120 seconds.
[ 1813.478751] Not tainted 5.15.0-rc1+ #15
[ 1813.483110] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Could not find a free area of memory of 0x3000 bytes...
locate_hole failed

I used the Ubuntu kernel, so this is pretty large:
-rwxrwxr-x 1 ubuntu ubuntu 277M Oct 5 15:47 vmlinux
-rw-r--r-- 1 root root 98M Sep 21 03:25 /boot/initrd.img


ACK. TBH I haven't tested initrd much; I usually don't use one, and when I do it's a small busybox-based rootfs.

Then if I don't load the initrd (I sometimes have the same warning as
above) I can at least kexec the new kernel but it fails to boot:

ubuntu@ubuntu:~$ sudo ./sbin/kexec -e
Warning: No cmdline or append string provided
Warning: No dtb provided, using /sys/firmware/fdt
[...]
[ 0.000000] SBI v0.2 HSM extension detected
[ 0.000000] CPU with hartid=0 is not available
[ 0.000000] ------------[ cut here ]------------
[ 0.000000] kernel BUG at arch/riscv/kernel/smpboot.c:107!
[ 0.000000] Kernel BUG [#1]
[ 0.000000] Modules linked in:
[ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.15.0-rc1+ #15
[ 0.000000] Hardware name: SiFive HiFive Unmatched A00 (DT)
[ 0.000000] epc : setup_smp+0xcc/0x142
[ 0.000000] ra : setup_smp+0xc4/0x142
[ 0.000000] epc : ffffffff80a04080 ra : ffffffff80a04078 sp : ffffffff81803ec0
[ 0.000000] gp : ffffffff81a23220 tp : ffffffff81810500 t0 : ffffffff81a3551f
[ 0.000000] t1 : ffffffffffffffff t2 : 0000000000000000 s0 : ffffffff81803f00
[ 0.000000] s1 : 0000000000000000 a0 : 0000000000000000 a1 : 0000000000000000
[ 0.000000] a2 : 0000000000000000 a3 : 0000000000000001 a4 : 0000000000000000
[ 0.000000] a5 : ffffffff80c64500 a6 : 0000000000000004 a7 : 000000000000ff00
[ 0.000000] s2 : 0000000000000005 s3 : 0000000000000000 s4 : ffffffff8118f9a8
[ 0.000000] s5 : 0000000000000007 s6 : ffffffff80c0b790 s7 : 0000000080000200
[ 0.000000] s8 : 0000000000000fff s9 : 0000000081000200 s10: 0000000000000018
[ 0.000000] s11: 000000000000000b t3 : 0000000000ff0000 t4 : ffffffffffffffff
[ 0.000000] t5 : ffffffff80c0b7a0 t6 : ffffffff81803bd8
[ 0.000000] status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003
[ 0.000000] [<ffffffff80a04080>] setup_smp+0xcc/0x142
[ 0.000000] [<ffffffff80a03d88>] setup_arch+0x56a/0x590
[ 0.000000] [<ffffffff80a00aa2>] start_kernel+0xaa/0xa5c
[ 0.000000] random: get_random_bytes called from oops_exit+0x44/0x70 with crng_init=0
[ 0.000000] ---[ end trace 0000000000000000 ]---
[ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
[ 0.000000] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---

This reliably fails here.


This looks weird, I'll check it out (we have an Unmatched here, so I'll try to get my hands on it sometime next week).

Did you try kdump? Do you get the same error?

BTW this is what I use for testing most of the time:

For kexec:
kexec -l /mnt/shared/vmlinux --reuse-cmdline
kexec -e

For kdump:
kexec -p /mnt/shared/vmlinux
echo c > /proc/sysrq-trigger
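
A minimal sketch of a guard around the kdump steps above, in case it helps with reproducing. The function names (has_crashkernel_param, flag_is_set, kdump_test) are hypothetical; the sketch assumes the running kernel was booted with a crashkernel= reservation and exposes /sys/kernel/kexec_crash_loaded, and it only triggers the panic once the crash kernel is reported as loaded:

```shell
#!/bin/sh
# Hypothetical helper names; the procfs/sysfs paths and the kexec -p flag
# are the standard ones, everything else is illustrative.

# Succeeds if the given kernel command line contains a crashkernel= option.
has_crashkernel_param() {
    case " $1 " in
        *" crashkernel="*) return 0 ;;
        *) return 1 ;;
    esac
}

# Succeeds if the given flag file is readable and contains exactly "1".
flag_is_set() {
    [ -r "$1" ] && [ "$(cat "$1")" = "1" ]
}

# Loads the crash kernel and triggers a panic only if the checks pass.
# Must be run as root; it is defined here but not called automatically.
kdump_test() {
    vmlinux=$1
    has_crashkernel_param "$(cat /proc/cmdline)" || {
        echo "no crashkernel= on the kernel command line" >&2
        return 1
    }
    kexec -p "$vmlinux" || return 1
    flag_is_set /sys/kernel/kexec_crash_loaded || {
        echo "crash kernel does not appear to be loaded" >&2
        return 1
    }
    echo c > /proc/sysrq-trigger
}
```

Usage would be something like `kdump_test /mnt/shared/vmlinux` from a root shell; if the crashkernel= check fails, nothing is loaded and no panic is triggered.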

Thanks a lot for your time!

Regards,
Nick