Re: [PATCH 1/1] kernel/crash_core.c - Add crashkernel=auto for x86 and ARM
From: Kairui Song
Date: Thu Nov 19 2020 - 01:10:28 EST
On Thu, Nov 19, 2020 at 7:29 AM Saeed Mirzamohammadi
<saeed.mirzamohammadi@xxxxxxxxxx> wrote:
>
> This adds crashkernel=auto feature to configure reserved memory for
> vmcore creation to both x86 and ARM platforms based on the total memory
> size.
Thanks for the patch! This is very helpful for distribution makers,
this allows the distro to have better control of the crashkernel size
and ship it with the kernel. The crashkernel value is sensitive to
kernel and driver changes, so shipping it with the kernel makes sense.
And I think crashkernel=auto could be used as an indicator that user
want the kernel to control the crashkernel size, so some further work
could be done to adjust the crashkernel more accordingly. eg. when
memory encryption is enabled, increase the crashkernel value for the
auto estimation, as it's known to consume more crashkernel memory.
There have been a lot of efforts trying to push this in upstream:
https://lkml.org/lkml/2018/5/20/262
https://lkml.org/lkml/2009/8/12/61
Still, it's not yet accepted. it's good to see more people working on this.
But why not make it arch-independent? This crashkernel=auto idea
should simply work with every arch.
>
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: John Donnelly <john.p.donnelly@xxxxxxxxxx>
> Signed-off-by: Saeed Mirzamohammadi <saeed.mirzamohammadi@xxxxxxxxxx>
> ---
> Documentation/admin-guide/kdump/kdump.rst | 5 +++++
> arch/arm64/Kconfig | 26 ++++++++++++++++++++++-
> arch/arm64/configs/defconfig | 1 +
> arch/x86/Kconfig | 26 ++++++++++++++++++++++-
> arch/x86/configs/x86_64_defconfig | 1 +
> kernel/crash_core.c | 20 +++++++++++++++--
> 6 files changed, 75 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/admin-guide/kdump/kdump.rst b/Documentation/admin-guide/kdump/kdump.rst
> index 75a9dd98e76e..f95a2af64f59 100644
> --- a/Documentation/admin-guide/kdump/kdump.rst
> +++ b/Documentation/admin-guide/kdump/kdump.rst
> @@ -285,7 +285,12 @@ This would mean:
> 2) if the RAM size is between 512M and 2G (exclusive), then reserve 64M
> 3) if the RAM size is larger than 2G, then reserve 128M
>
> +Or you can use crashkernel=auto if you have enough memory. The threshold
> +is 1G on x86_64 and arm64. If your system memory is less than the threshold,
> +crashkernel=auto will not reserve memory. The size changes according to
> +the system memory size like below:
>
> + x86_64/arm64: 1G-64G:128M,64G-1T:256M,1T-:512M
>
> Boot into System Kernel
> =======================
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 1515f6f153a0..d359dcffa80e 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -1124,7 +1124,7 @@ comment "Support for PE file signature verification disabled"
> depends on KEXEC_SIG
> depends on !EFI || !SIGNED_PE_FILE_VERIFICATION
>
> -config CRASH_DUMP
> +menuconfig CRASH_DUMP
> bool "Build kdump crash kernel"
> help
> Generate crash dump after being started by kexec. This should
> @@ -1135,6 +1135,30 @@ config CRASH_DUMP
>
> For more details see Documentation/admin-guide/kdump/kdump.rst
>
> +if CRASH_DUMP
> +
> +config CRASH_AUTO_STR
> + string "Memory reserved for crash kernel"
> + depends on CRASH_DUMP
> + default "1G-64G:128M,64G-1T:256M,1T-:512M"
> + help
> + This configures the reserved memory dependent
> + on the value of System RAM. The syntax is:
> + crashkernel=<range1>:<size1>[,<range2>:<size2>,...][@offset]
> + range=start-[end]
> +
> + For example:
> + crashkernel=512M-2G:64M,2G-:128M
> +
> + This would mean:
> +
> + 1) if the RAM is smaller than 512M, then don't reserve anything
> + (this is the "rescue" case)
> + 2) if the RAM size is between 512M and 2G (exclusive), then reserve 64M
> + 3) if the RAM size is larger than 2G, then reserve 128M
> +
> +endif # CRASH_DUMP
> +
> config XEN_DOM0
> def_bool y
> depends on XEN
> diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
> index 5cfe3cf6f2ac..899ef3b6a78f 100644
> --- a/arch/arm64/configs/defconfig
> +++ b/arch/arm64/configs/defconfig
> @@ -69,6 +69,7 @@ CONFIG_SECCOMP=y
> CONFIG_KEXEC=y
> CONFIG_KEXEC_FILE=y
> CONFIG_CRASH_DUMP=y
> +# CONFIG_CRASH_AUTO_STR is not set
> CONFIG_XEN=y
> CONFIG_COMPAT=y
> CONFIG_RANDOMIZE_BASE=y
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index f6946b81f74a..bacd17312bb1 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -2035,7 +2035,7 @@ config KEXEC_BZIMAGE_VERIFY_SIG
> help
> Enable bzImage signature verification support.
>
> -config CRASH_DUMP
> +menuconfig CRASH_DUMP
> bool "kernel crash dumps"
> depends on X86_64 || (X86_32 && HIGHMEM)
> help
> @@ -2049,6 +2049,30 @@ config CRASH_DUMP
> (CONFIG_RELOCATABLE=y).
> For more details see Documentation/admin-guide/kdump/kdump.rst
>
> +if CRASH_DUMP
> +
> +config CRASH_AUTO_STR
> + string "Memory reserved for crash kernel" if X86_64
> + depends on CRASH_DUMP
> + default "1G-64G:128M,64G-1T:256M,1T-:512M"
> + help
> + This configures the reserved memory dependent
> + on the value of System RAM. The syntax is:
> + crashkernel=<range1>:<size1>[,<range2>:<size2>,...][@offset]
> + range=start-[end]
> +
> + For example:
> + crashkernel=512M-2G:64M,2G-:128M
> +
> + This would mean:
> +
> + 1) if the RAM is smaller than 512M, then don't reserve anything
> + (this is the "rescue" case)
> + 2) if the RAM size is between 512M and 2G (exclusive), then reserve 64M
> + 3) if the RAM size is larger than 2G, then reserve 128M
> +
> +endif # CRASH_DUMP
> +
> config KEXEC_JUMP
> bool "kexec jump"
> depends on KEXEC && HIBERNATION
> diff --git a/arch/x86/configs/x86_64_defconfig b/arch/x86/configs/x86_64_defconfig
> index 9936528e1939..7a87fbecf40b 100644
> --- a/arch/x86/configs/x86_64_defconfig
> +++ b/arch/x86/configs/x86_64_defconfig
> @@ -33,6 +33,7 @@ CONFIG_EFI_MIXED=y
> CONFIG_HZ_1000=y
> CONFIG_KEXEC=y
> CONFIG_CRASH_DUMP=y
> +# CONFIG_CRASH_AUTO_STR is not set
> CONFIG_HIBERNATION=y
> CONFIG_PM_DEBUG=y
> CONFIG_PM_TRACE_RTC=y
> diff --git a/kernel/crash_core.c b/kernel/crash_core.c
> index 106e4500fd53..a44cd9cc12c4 100644
> --- a/kernel/crash_core.c
> +++ b/kernel/crash_core.c
> @@ -7,6 +7,7 @@
> #include <linux/crash_core.h>
> #include <linux/utsname.h>
> #include <linux/vmalloc.h>
> +#include <linux/kexec.h>
>
> #include <asm/page.h>
> #include <asm/sections.h>
> @@ -41,6 +42,15 @@ static int __init parse_crashkernel_mem(char *cmdline,
> unsigned long long *crash_base)
> {
> char *cur = cmdline, *tmp;
> + unsigned long long total_mem = system_ram;
> +
> + /*
> + * Firmware sometimes reserves some memory regions for it's own use.
> + * so we get less than actual system memory size.
> + * Workaround this by round up the total size to 128M which is
> + * enough for most test cases.
> + */
> + total_mem = roundup(total_mem, SZ_128M);
I think this rounding may be better moved to the arch specified part
where parse_crashkernel is called?
>
> /* for each entry of the comma-separated list */
> do {
> @@ -85,13 +95,13 @@ static int __init parse_crashkernel_mem(char *cmdline,
> return -EINVAL;
> }
> cur = tmp;
> - if (size >= system_ram) {
> + if (size >= total_mem) {
> pr_warn("crashkernel: invalid size\n");
> return -EINVAL;
> }
>
> /* match ? */
> - if (system_ram >= start && system_ram < end) {
> + if (total_mem >= start && total_mem < end) {
> *crash_size = size;
> break;
> }
> @@ -250,6 +260,12 @@ static int __init __parse_crashkernel(char *cmdline,
> if (suffix)
> return parse_crashkernel_suffix(ck_cmdline, crash_size,
> suffix);
> +#ifdef CONFIG_CRASH_AUTO_STR
> + if (strncmp(ck_cmdline, "auto", 4) == 0) {
> + ck_cmdline = CONFIG_CRASH_AUTO_STR;
> + pr_info("Using crashkernel=auto, the size chosen is a best effort estimation.\n");
> + }
> +#endif
> /*
> * if the commandline contains a ':', then that's the extended
> * syntax -- if not, it must be the classic syntax
> --
> 2.18.4
>
--
Best Regards,
Kairui Song