Pnina!

We’ve defined in our system that when a process crashes, we call panic().
Pnina Feder <pnina.feder@xxxxxxxxxxxx> writes:
We are creating a vmcore using kexec on a Linux 6.15 RISC-V system and
analyzing it with the crash tool on the host. This workflow used to
work on Linux 6.14 but is now broken in 6.15.

Thanks for reporting this!

The issue is caused by a change in the kernel:
In Linux 6.15, certain memblock sections are now marked as Reserved in
/proc/iomem. The kexec tool excludes all Reserved regions when
generating the vmcore, so these sections are missing from the dump.

How are you collecting the /proc/vmcore file? A full set of commands would be helpful.

To handle crash recovery, we're using kexec with the following command:
kexec -p /Image --initrd=/rootfs.cpio --append "console=${con} earlycon=${earlycon} no4lvl"
To simulate a crash, we trigger it using:
sleep 100 & kill -6 $!
This boots into the crash kernel (kdump), where we then copy the /proc/vmcore file back to the host for analysis.
However, the kernel still uses addresses in these regions, for example
for IRQ pointers. Since the crash tool needs access to these memory
areas to function correctly, their exclusion breaks the analysis.

What do you mean by "IRQ pointers"? Also, what version (sha1) of crash are you using?

We are currently using crash-utility version 9.0.0 (master).
From the crash analysis logs, we observed errors like:
"......
IRQ stack pointer[0] is ffffffd6fbdcc068
crash: read error: kernel virtual address: ffffffd6fbdcc068 type: "IRQ stack pointer"
.....
<read_kdump: addr: ffffffff80edf1cc paddr: 8010df1cc cnt: 4>
<readmem: ffffffd6fbdd6880, KVADDR, "runqueues entry (per_cpu)", 3456, (FOE), 55acf03963e0>
<read_kdump: addr: ffffffd6fbdd6880 paddr: 8fbdd6880 cnt: 1920>
crash: read error: kernel virtual address: ffffffd6fbdd6880 type: "runqueues entry (per_cpu)"
These failures occur consistently for addresses in the 0xffffffd000000000 region.
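
For anyone wanting to reproduce the observation, here is a minimal sketch of how the failing addresses could be pulled out of a crash debug log and checked against that region. The log text is copied from the excerpt above. The helper name `failed_reads` and the 28-bit region mask are assumptions made for illustration; they are not part of crash.

```python
import re

# Log lines copied from the crash output quoted in this mail.
LOG = '''\
IRQ stack pointer[0] is ffffffd6fbdcc068
crash: read error: kernel virtual address: ffffffd6fbdcc068 type: "IRQ stack pointer"
<read_kdump: addr: ffffffff80edf1cc paddr: 8010df1cc cnt: 4>
<readmem: ffffffd6fbdd6880, KVADDR, "runqueues entry (per_cpu)", 3456, (FOE), 55acf03963e0>
crash: read error: kernel virtual address: ffffffd6fbdd6880 type: "runqueues entry (per_cpu)"
'''

# Matches the "read error" lines and captures the address and the object type.
READ_ERROR = re.compile(
    r'read error: kernel virtual address: ([0-9a-f]+)\s+type: "([^"]+)"'
)

# Assumption: "the 0xffffffd000000000 region" means the top 28 bits match.
REGION_BASE = 0xFFFFFFD000000000
REGION_MASK = 0xFFFFFFF000000000


def failed_reads(log):
    """Return (address, type) pairs for every read error in the log (hypothetical helper)."""
    return [(int(m.group(1), 16), m.group(2)) for m in READ_ERROR.finditer(log)]


failures = failed_reads(LOG)
for addr, what in failures:
    in_region = (addr & REGION_MASK) == REGION_BASE
    print(f"{addr:#018x}  in_region={in_region}  {what}")
```

The successful read at ffffffff80edf1cc does not match the pattern, so only the two failing addresses are listed.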
Upon inspection, we confirmed that the physical addresses corresponding to those virtual addresses are not present in the vmcore, as they fall under Reserved memory sections.
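
The check itself can be sketched roughly as follows: parse /proc/iomem-style text and look up which top-level region a physical address falls into. The `SAMPLE_IOMEM` layout below is entirely hypothetical and is not taken from our system. Only the two physical addresses come from the log above. The helper names are illustrative as well.

```python
# Hypothetical /proc/iomem excerpt; the real layout on the affected board differs.
SAMPLE_IOMEM = """\
800000000-8fbdcbfff : System RAM
  800200000-8013fffff : Kernel image
8fbdcc000-8fbddffff : Reserved
8fbde0000-8ffffffff : System RAM
"""


def parse_iomem(text):
    """Return (start, end, name) tuples for the top-level entries only."""
    regions = []
    for line in text.splitlines():
        if not line.strip() or line[0].isspace():
            continue  # skip blank lines and indented (nested) child entries
        span, _, name = line.partition(" : ")
        start, _, end = span.partition("-")
        regions.append((int(start, 16), int(end, 16), name.strip()))
    return regions


def region_for(paddr, regions):
    """Return the name of the region containing paddr, or None if there is none."""
    for start, end, name in regions:
        if start <= paddr <= end:
            return name
    return None


regions = parse_iomem(SAMPLE_IOMEM)
# paddr values taken from the read_kdump lines in the log.
for paddr in (0x8010DF1CC, 0x8FBDD6880):
    print(f"{paddr:#x} -> {region_for(paddr, regions)}")
```

On a real system the text would come from reading /proc/iomem as root, since unprivileged reads show zeroed addresses.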
We tested a patch to kexec-tools that prevents exclusion of the Reserved-memblock section from the vmcore. With this patch, the issue no longer occurs, and crash analysis succeeds.
Note: I suspect the same issue exists on ARM64, as both the signal.c and kexec-tools implementations are similar.
Thanks!
Björn