On 2/5/25 5:47 AM, David Hildenbrand wrote:
> On 04.02.25 21:41, David Hildenbrand wrote:
>> On 04.02.25 21:26, Jason Gunthorpe wrote:
>>> On Tue, Feb 04, 2025 at 09:05:47PM +0100, David Hildenbrand wrote:
>>>> Fully agreed, this is going into the right direction. Dumping what's
>>>> mapped is a different story. Maybe that dumping logic could simply be
>>>> written in C for the time being?
>>>
>>> ?
>>>
>>> Isn't dumping just a
>>>   decode pte -> phys_to_virt() -> for_each_u64(virt) -> printk?
>>
>> IIUC, the problematic bit is that you might not have a directmap such
>> that phys_to_virt() would tell you the whole story.
>
> ... but it's late and I am confused. For dumping the *page table* that
> would not be required, only when dumping mapped page content (and at
> this point I am not sure if that is a requirement).
>
> So hopefully Asahi Lina can clarify what the issue was (if there is
> any :) ).

Yes, the crash dumper has to dump the mapped page content. In fact, I
don't care about the page tables themselves other than PTE permission
bits, and the page tables alone are not useful for debugging firmware
crashes (and aren't even included in the dump verbatim, at least not the
kernel-allocated ones). The goal of the crash dumper is to take a
complete dump of the firmware's virtual address space (which includes
the kinds of memory I mentioned in [1]). The output is an ELF-format
core dump containing all memory that the GPU firmware can access and
that the kernel was able to dump successfully (which should be
everything except for MMIO if the bootloader/DT have the right
reserved-memory setup).
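
To make the output format concrete, here is a rough userspace sketch of
how an ELF core file could describe one firmware mapping. This is not
the driver's code: struct toy_region, toy_fill_ehdr() and
toy_fill_phdr() are hypothetical names, the machine type is left to the
caller, and recording an unreadable range as a segment with no file
bytes is only one possible choice. Only the <elf.h> types and constants
are real.

```c
#include <elf.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical description of one firmware mapping. */
struct toy_region {
	uint64_t vaddr;    /* firmware virtual address */
	uint64_t size;     /* length in bytes */
	int writable;      /* taken from the PTE permission bits */
	int dumped;        /* 0 if the contents could not be read (e.g. MMIO) */
};

/* Fill an ELF64 little-endian header for a core file with phnum segments. */
static void toy_fill_ehdr(Elf64_Ehdr *eh, uint16_t machine, uint16_t phnum)
{
	memset(eh, 0, sizeof(*eh));
	memcpy(eh->e_ident, ELFMAG, SELFMAG);
	eh->e_ident[EI_CLASS] = ELFCLASS64;
	eh->e_ident[EI_DATA] = ELFDATA2LSB;
	eh->e_ident[EI_VERSION] = EV_CURRENT;
	eh->e_type = ET_CORE;
	eh->e_machine = machine;
	eh->e_version = EV_CURRENT;
	eh->e_phoff = sizeof(*eh);          /* program headers follow directly */
	eh->e_ehsize = sizeof(*eh);
	eh->e_phentsize = sizeof(Elf64_Phdr);
	eh->e_phnum = phnum;
}

/* Describe one mapping as a PT_LOAD segment located at file_off. */
static void toy_fill_phdr(Elf64_Phdr *ph, const struct toy_region *r,
			  uint64_t file_off)
{
	memset(ph, 0, sizeof(*ph));
	ph->p_type = PT_LOAD;
	ph->p_flags = PF_R | (r->writable ? PF_W : 0);
	ph->p_offset = file_off;
	ph->p_vaddr = r->vaddr;
	ph->p_memsz = r->size;
	/* Assumption: ranges that could not be read carry no file bytes. */
	ph->p_filesz = r->dumped ? r->size : 0;
	ph->p_align = 1;
}
```

With a layout like this, the permission bits end up in p_flags and the
page tables themselves never need to appear in the file.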
I *think* phys_to_virt() should work just fine for *this* specific use
case on this platform, but I'm not entirely sure. I still want to use
the various pfn check functions before doing that, to exclude ranges
that would definitely not work.
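
Roughly, the ordering I have in mind (check first, translate second)
looks like the toy userspace model below. Everything here is made up
for illustration: the PTE layout, the toy_* names and the pretend RAM
array are assumptions, and toy_pfn_valid() / toy_phys_to_virt() are
only stand-ins for the real pfn checks and phys_to_virt().

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_PAGE_SHIFT 12
#define TOY_PAGE_SIZE  (1u << TOY_PAGE_SHIFT)
#define TOY_NR_PAGES   4                 /* size of the pretend RAM */

/* Hypothetical PTE layout: bit 0 = valid, bits 12 and up = pfn. */
#define TOY_PTE_VALID  (1ull << 0)

static uint8_t toy_ram[TOY_NR_PAGES][TOY_PAGE_SIZE];

/* Stand-in for the pfn checks: only pfns backed by toy_ram pass. */
static int toy_pfn_valid(uint64_t pfn)
{
	return pfn < TOY_NR_PAGES;
}

/* Stand-in for phys_to_virt(); only call it after toy_pfn_valid(). */
static void *toy_phys_to_virt(uint64_t phys)
{
	return &toy_ram[phys >> TOY_PAGE_SHIFT][phys & (TOY_PAGE_SIZE - 1)];
}

/*
 * Copy the page a PTE points to into out (TOY_PAGE_SIZE bytes).
 * Returns 0 on success, or -1 if the PTE is not valid or the pfn
 * fails the check, in which case nothing is read.
 */
static int toy_dump_pte(uint64_t pte, uint8_t *out)
{
	uint64_t pfn;

	if (!(pte & TOY_PTE_VALID))
		return -1;

	pfn = pte >> TOY_PAGE_SHIFT;
	if (!toy_pfn_valid(pfn))
		return -1;               /* skip ranges that cannot work */

	memcpy(out, toy_phys_to_virt(pfn << TOY_PAGE_SHIFT), TOY_PAGE_SIZE);
	return 0;
}
```

The point is only that a range which fails the check is skipped before
any address translation or read is attempted.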