Re: [v2 1/5] arm64: kdump: reserve memory for crash dump kernel

From: AKASHI Takahiro
Date: Mon May 11 2015 - 02:44:47 EST


On 04/24/2015 07:11 PM, Mark Rutland wrote:
On Fri, Apr 24, 2015 at 08:53:04AM +0100, AKASHI Takahiro wrote:
On system kernel, the memory region used by crash dump kernel must be
specified by "crashkernel=X@Y" boot parameter. reserve_crashkernel()
will allocate the region in "System RAM" and reserve it for later use.
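
For illustration, the "X@Y" (size@base) syntax can be parsed roughly as shown below. This is a user-space sketch only, not the kernel's own parser: the helper names parse_size() and parse_size_at_offset() are hypothetical, and only the K/M/G suffixes are handled.

```c
#include <stdlib.h>

/* Hypothetical helper: parse a number with an optional K/M/G suffix. */
static unsigned long long parse_size(const char *s, char **end)
{
	unsigned long long v = strtoull(s, end, 0);

	switch (**end) {
	case 'G': case 'g':
		v <<= 10;
		/* fall through */
	case 'M': case 'm':
		v <<= 10;
		/* fall through */
	case 'K': case 'k':
		v <<= 10;
		(*end)++;	/* consume the suffix character */
		break;
	default:
		break;
	}
	return v;
}

/* Hypothetical helper: split "X@Y" into size and base; 0 on success. */
static int parse_size_at_offset(const char *arg,
				unsigned long long *size,
				unsigned long long *base)
{
	char *end;

	*size = parse_size(arg, &end);
	if (end == arg || *size == 0 || *end != '@')
		return -1;	/* no size given, or no "@Y" part */

	arg = end + 1;
	*base = parse_size(arg, &end);
	if (end == arg || *end != '\0')
		return -1;	/* no base given, or trailing junk */
	return 0;
}
```

With this sketch, a string such as "256M@0x80000000" yields a size of 256 MiB and a base of 0x80000000, while a bare "256M" is rejected because no base address is given.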

On crash dump kernel, memory region information in system kernel is
described in a specific region specified by "elfcorehdr=X@Y" boot parameter.
reserve_elfcorehdr() will set aside the region to avoid data destruction
by the kernel.

Crash dump kernel will access memory regions in system kernel via
copy_oldmem_page(), which reads a page by ioremap'ing it assuming that
such pages are not part of main memory of crash dump kernel.
This is true under non-UEFI environment because kexec-tools modifies
a device tree adding "usablemem" attributes to memory sections.

I'm not sure what you mean by "usablemem" here.

I think I explained it in my previous reply.

Do you just mean that the memory nodes are altered such that they only
cover memory usable by the crash kernel?

Why not _always_ require a command line argument for the crash kernel
that restricts its memory usage to a particular range? That way it
doesn't matter whether we're using UEFI or not.

This is one option, but why does UEFI ignore all the memory properties?

Under UEFI, however, this is not true because UEFI removes the memory sections
from the device tree and exports all the memory regions, even those that belong
to the system kernel.

So we should add "mem=X[MG]" boot parameter to limit the memory size and
avoid hitting the following assertion in ioremap():
	if (WARN_ON(pfn_valid(__phys_to_pfn(phys_addr))))
		return NULL;

That looks suspicious. What is being ioremapped at that point?

As explained so far, all the memory regions are exposed to crash dump kernel,
and it recognizes any pages which should belong to the old kernel also as
part of crash kernel's memory. So pfn_valid() returns true.
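
A rough user-space model of this situation is sketched below. It is not kernel code: struct region, addr_is_kernel_ram() and can_ioremap() are hypothetical names that stand in for the memory map, pfn_valid() and the check in ioremap(), and the addresses used with it are made up.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one memory region known to the running kernel. */
struct region {
	unsigned long long base;
	unsigned long long size;
};

/* Stands in for pfn_valid(): true if addr lies inside any known region. */
static bool addr_is_kernel_ram(const struct region *map, size_t n,
			       unsigned long long addr)
{
	for (size_t i = 0; i < n; i++)
		if (addr >= map[i].base && addr - map[i].base < map[i].size)
			return true;
	return false;
}

/* Stands in for the ioremap() check: refuse to remap what is already RAM. */
static bool can_ioremap(const struct region *map, size_t n,
			unsigned long long addr)
{
	return !addr_is_kernel_ram(map, n, addr);
}
```

If the map handed to the crash dump kernel covers all of RAM, an address in the old kernel's memory counts as the crash kernel's own RAM and can_ioremap() returns false. If the map covers only the reserved crash kernel region, the same address falls outside it and the remap is allowed.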


[...]

@@ -393,6 +398,7 @@ void __init setup_arch(char **cmdline_p)
 	local_async_enable();

 	efi_init();
+
 	arm64_memblock_init();

 	paging_init();

Nit: unrelated whitespace change.

Ok. Will fix it.

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index ae85da6..ea70d41 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -34,6 +34,8 @@
#include <linux/dma-contiguous.h>
#include <linux/efi.h>
#include <linux/swiotlb.h>
+#include <linux/kexec.h>
+#include <linux/crash_dump.h>

Nit: please keep these ordered.

Yeah, but the other "linux/*.h" includes in this file are already in random order.


[...]

+	if (memblock_reserve(crash_base, crash_size)) {
+		pr_warn("crashkernel reservation failed - out of memory\n");
+		return;
+	}

If we can remove this memory rather than reserving it, we can limit the
first kernel's ability to accidentally clobber the crash kernel, at the
expense of having to explicitly map/unmap around loading it.

Do you mean that we should remove the MMU mapping of the crash kernel memory?
That might be a good idea, but it requires modifying kernel/kexec.c.

-Takahiro AKASHI

Mark.
