Re: [PATCH] arm64/mm/hotplug: Warn when memory limit has been reduced

From: Catalin Marinas
Date: Thu Sep 16 2021 - 10:34:37 EST


On Tue, Sep 14, 2021 at 12:21:10PM +0530, Anshuman Khandual wrote:
> If the max memory limit has been reduced with 'mem=' kernel command line
> option, there might be UEFI memory map described memory beyond that limit
> which could be hot removed. This might be problematic for subsequent kexec
> kernel which could just access such removed memory.
>
> Memory offline notifier exists because there is no other way to block the
> removal of boot memory, only the offlining (which isn't actually a problem).
> But with 'mem=', there is no chance to stop such boot memory being offlined
> as it was never in use by the kernel. As 'mem=' is a debug-only option on
> the arm64 platform, just warn for such a situation and move on.

Just to make sure I understand: is the memory beyond the mem= limit
considered online by the core code, so that it can subsequently be offlined?
Looking at walk_system_ram_range(), it doesn't seem to care about the
removed memblock ranges. Would such memory beyond the mem= limit need to
have been onlined first?

> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index cfd9deb347c3..7ac39ee876c3 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -1627,6 +1627,18 @@ static int __init prevent_bootmem_remove_init(void)
>  	if (!IS_ENABLED(CONFIG_MEMORY_HOTREMOVE))
>  		return ret;
>
> +	if (has_mem_limit_reduced()) {
> +		/*
> +		 * Physical memory limit has been reduced via the 'mem=' kernel
> +		 * command line option. Memory beyond reduced limit could now be
> +		 * removed and reassigned (guest ?) transparently to the kernel.
> +		 * This might cause subsequent kexec kernel to crash or at least
> +		 * corrupt the memory when accessing UEFI memory map enumerated
> +		 * boot memory which might have been repurposed.
> +		 */
> +		pr_warn("Memory limit reduced, kexec might be problematic\n");
> +	}

I'd actually move the warning to the hotplug notifier callback rather than
the init function. I'd also make it pr_warn_once().
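
A rough userspace sketch of that suggestion, not the actual kernel change.
It assumes the warning is tied to the MEM_GOING_OFFLINE action; the
constants, the pr_warn_once() macro, the mem_limit_reduced flag and the
bootmem_remove_notifier_model() name are all local stand-ins invented for
illustration, and only has_mem_limit_reduced() is a name taken from the
patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Local stand-ins for kernel definitions; this is a userspace model only. */
#define MEM_GOING_OFFLINE	(1 << 1)
#define MEM_OFFLINE		(1 << 2)
#define NOTIFY_OK		0x0001

/* Counts emitted warnings so the once-only behaviour can be observed. */
int warn_count;

/* Simplified model of pr_warn_once(): each call site prints at most once. */
#define pr_warn_once(msg)			\
	do {					\
		static bool warned_;		\
		if (!warned_) {			\
			warned_ = true;		\
			warn_count++;		\
			fputs(msg, stderr);	\
		}				\
	} while (0)

/* Hypothetical flag backing the has_mem_limit_reduced() helper from the patch. */
bool mem_limit_reduced;

static bool has_mem_limit_reduced(void)
{
	return mem_limit_reduced;
}

/* Hypothetical notifier body: warn when offlining starts, not at init. */
int bootmem_remove_notifier_model(unsigned long action)
{
	if (action == MEM_GOING_OFFLINE && has_mem_limit_reduced())
		pr_warn_once("Memory limit reduced, kexec might be problematic\n");

	return NOTIFY_OK;
}

/* Usage: two offline attempts with a reduced limit, a single warning. */
void demo_offline_twice(void)
{
	mem_limit_reduced = true;
	bootmem_remove_notifier_model(MEM_GOING_OFFLINE);
	bootmem_remove_notifier_model(MEM_GOING_OFFLINE);
	assert(warn_count == 1);
}
```

With this shape the message only shows up if someone actually tries to
offline memory while the limit is reduced, and repeated attempts do not
flood the log.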

--
Catalin