Re: [PATCH 0/1] Fix for riscv vmcore issue

From: Alexandre Ghiti
Date: Fri Jul 04 2025 - 08:26:05 EST

Next message: David Hildenbrand: "Re: [PATCH v2] selftests/mm: pagemap_scan ioctl: add PFN ZERO test cases"
Previous message: Kirill A. Shutemov: "Re: [PATCHv8 04/17] x86/cpu: Defer CR pinning setup until after EFI initialization"
In reply to: Pnina Feder: "RE: [PATCH 0/1] Fix for riscv vmcore issue"
Next in thread: Pnina Feder: "RE: [PATCH 0/1] Fix for riscv vmcore issue"

Hi Pnina,

On 7/3/25 14:06, Pnina Feder wrote:

Pnina!

Pnina Feder <pnina.feder@xxxxxxxxxxxx> writes:

We are creating a vmcore using kexec on a Linux 6.15 RISC-V system and
analyzing it with the crash tool on the host. This workflow used to
work on Linux 6.14 but is now broken in 6.15.

Thanks for reporting this!

The issue is caused by a change in the kernel:
In Linux 6.15, certain memblock sections are now marked as Reserved in
/proc/iomem. The kexec tool excludes all Reserved regions when
generating the vmcore, so these sections are missing from the dump.
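To see which ranges are affected, the Reserved entries can be pulled out of /proc/iomem. The following is a minimal sketch, not what kexec-tools does internally. The `SAMPLE` text is a made-up layout for illustration only, and the function names are hypothetical.

```python
import re

# One /proc/iomem line: optional indentation, then "start-end : name" (hex bounds).
_IOMEM_LINE = re.compile(r"^(\s*)([0-9a-fA-F]+)-([0-9a-fA-F]+) : (.*)$")


def parse_iomem(text):
    """Return (depth, start, end, name) tuples; depth is the nesting level."""
    entries = []
    for line in text.splitlines():
        m = _IOMEM_LINE.match(line)
        if not m:
            continue  # skip anything that does not look like a resource line
        indent, start, end, name = m.groups()
        entries.append((len(indent) // 2, int(start, 16), int(end, 16), name.strip()))
    return entries


def reserved_ranges(entries):
    """Keep only the ranges whose resource name is 'Reserved' (end is inclusive)."""
    return [(start, end) for _, start, end, name in entries
            if name.lower() == "reserved"]


# Hypothetical layout, NOT taken from the reporter's system.
SAMPLE = """\
800000000-8ffffffff : System RAM
  8fbc00000-8fbffffff : Reserved
"""

for start, end in reserved_ranges(parse_iomem(SAMPLE)):
    print(f"{start:#x}-{end:#x}")
```

On a live system the input would be the contents of /proc/iomem read as root.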

How are you collecting the /proc/vmcore file? A full set of commands would be helpful.

Our system is configured to call panic() whenever a process crashes.
To handle crash recovery, we're using kexec with the following command:
kexec -p /Image --initrd=/rootfs.cpio --append "console=${con} earlycon=${earlycon} no4lvl"

To simulate a crash, we trigger it using:
sleep 100 & kill -6 $!

This boots into the crash kernel (kdump), where we then copy the /proc/vmcore file back to the host for analysis.
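For reference, the trigger command above sends signal 6 (SIGABRT on Linux) to a background process. A small sketch of the same step, assuming a POSIX system; it only demonstrates the signal delivery, not the panic() policy, which is specific to the reporter's setup. The function name is hypothetical.

```python
import signal
import subprocess
import sys


def abort_child():
    """Start a sleeping child, send it SIGABRT, and return its exit status."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(100)"]
    )
    child.send_signal(signal.SIGABRT)  # same effect as `kill -6 <pid>`
    return child.wait()                # negative value means "killed by signal"


status = abort_child()
print("child terminated by signal", -status)
```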

However, the kernel still uses addresses in these regions—for example,
for IRQ pointers. Since the crash tool needs access to these memory
areas to function correctly, their exclusion breaks the analysis.

What do you mean by "IRQ pointers"? Also, what version (sha1) of crash are you using?

We are currently using crash-utility version 9.0.0 (master).
From the crash analysis logs, we observed errors like:

"......
IRQ stack pointer[0] is ffffffd6fbdcc068
crash: read error: kernel virtual address: ffffffd6fbdcc068 type: "IRQ stack pointer"
.....

<read_kdump: addr: ffffffff80edf1cc paddr: 8010df1cc cnt: 4>
<readmem: ffffffd6fbdd6880, KVADDR, "runqueues entry (per_cpu)", 3456, (FOE), 55acf03963e0>

<read_kdump: addr: ffffffd6fbdd6880 paddr: 8fbdd6880 cnt: 1920>

crash: read error: kernel virtual address: ffffffd6fbdd6880 type: "runqueues entry (per_cpu)"

I can't reproduce this issue on qemu, booting with sv39. I'm using the latest kexec-tools (which recently merged riscv support), crash 9.0.0 and kernel 6.16.0-rc4. Note that I'm using crash in qemu.

Are you able to reproduce this on qemu too?

Maybe that's related to the config, can you share your config?

These failures occur consistently for addresses in the 0xffffffd000000000 region.

FYI, this region is the direct mapping (see Documentation/arch/riscv/vm-layout.rst).
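Because the direct mapping is linear, the physical address behind each failing virtual address can be worked out from a single known pair. The sketch below derives the offset from the read_kdump line quoted in this thread. It assumes a constant-offset mapping and does not read any value from kernel sources; the helper name is hypothetical.

```python
# Virtual/physical pair taken from the read_kdump line quoted in this thread.
SAMPLE_VADDR = 0xFFFFFFD6FBDD6880
SAMPLE_PADDR = 0x8FBDD6880

# Assumption: the direct mapping is linear, so one pair fixes the offset.
VA_PA_OFFSET = SAMPLE_VADDR - SAMPLE_PADDR


def linear_va_to_pa(vaddr, offset=VA_PA_OFFSET):
    """Translate a direct-mapping virtual address to a physical address."""
    return vaddr - offset


# The failing "IRQ stack pointer" address from the crash log.
irq_stack_va = 0xFFFFFFD6FBDCC068

print(hex(VA_PA_OFFSET))                   # 0xffffffce00000000
print(hex(linear_va_to_pa(irq_stack_va)))  # 0x8fbdcc068
```

Both failing addresses therefore land in the same physical neighborhood (0x8fbd...), which is the range to look for in /proc/iomem and in the vmcore program headers.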

Thanks,

Alex

Upon inspection, we confirmed that the physical addresses corresponding to those virtual addresses are not present in the vmcore, as they fall under Reserved memory sections.
We tested a patch to kexec-tools that prevents exclusion of the Reserved-memblock section from the vmcore. With this patch, the issue no longer occurs, and crash analysis succeeds.
Note: I suspect the same issue exists on ARM64, as both the signal.c and kexec-tools implementations are similar.
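The "is this physical address in the vmcore" check described above can be reproduced by walking the PT_LOAD program headers of the dump. This is a minimal sketch assuming a little-endian ELF64 /proc/vmcore image; it does not handle the extended program-header count (e_phnum of 0xffff) or compressed dump formats. The function names and the `vmcore` path in the usage note are hypothetical.

```python
import struct

PT_LOAD = 1
ELF64_EHDR = struct.Struct("<16sHHIQQQIHHHHHH")  # Elf64_Ehdr, little-endian
ELF64_PHDR = struct.Struct("<IIQQQQQQ")          # Elf64_Phdr, little-endian


def load_segments(data):
    """Return (p_paddr, p_filesz) for every PT_LOAD program header in data."""
    ehdr = ELF64_EHDR.unpack_from(data, 0)
    ident, e_phoff, e_phentsize, e_phnum = ehdr[0], ehdr[5], ehdr[9], ehdr[10]
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("not a little-endian ELF64 image")
    segments = []
    for i in range(e_phnum):
        fields = ELF64_PHDR.unpack_from(data, e_phoff + i * e_phentsize)
        p_type, p_paddr, p_filesz = fields[0], fields[4], fields[5]
        if p_type == PT_LOAD:
            segments.append((p_paddr, p_filesz))
    return segments


def paddr_in_dump(segments, paddr):
    """True if paddr falls inside any PT_LOAD segment's file-backed range."""
    return any(start <= paddr < start + size for start, size in segments)
```

Usage would look like `segs = load_segments(open("vmcore", "rb").read(1 << 20))` followed by `paddr_in_dump(segs, 0x8fbdd6880)`; the headers sit at the start of the file, so reading the first megabyte is normally enough.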

Thanks!
Björn
