Re: [PATCH V9 2/2] arm64/mm: Enable memory hot remove

From: Catalin Marinas
Date: Fri Oct 18 2019 - 05:48:59 EST

On Fri, Oct 11, 2019 at 08:26:32AM +0530, Anshuman Khandual wrote:
> On 10/10/2019 05:04 PM, Catalin Marinas wrote:
> > Mark Rutland mentioned at some point that, as a preparatory patch to
> > this series, we'd need to make sure we don't hot-remove memory already
> > given to the kernel at boot. Any plans here?
> Hmm, this series just enables platform memory hot remove as required by
> the generic memory hotplug framework. The path here is triggered either
> from remove_memory() or __remove_memory(), which take physical memory
> range arguments like (nid, start, size) and do the needful.
> arch_remove_memory() should never be required to test the given memory
> range for anything, including being part of the boot memory.

Assuming arch_remove_memory() doesn't (cannot) check, is there a risk on
arm64 that, for example, one removes memory available at boot and then
kexecs a new kernel? Does the kexec tool present the new kernel with the
original memory map?

I can see x86 has CONFIG_FIRMWARE_MEMMAP suggesting that it is used by
kexec. try_remove_memory() calls firmware_map_remove() so maybe they
solve this problem differently.

Correspondingly, after an arch_add_memory(), do we want a kexec kernel
to access it? x86 seems to use the firmware_map_add_hotplug() mechanism.

Adding James as well for additional comments on kexec scenarios.

> IIUC boot memory added to the system with memblock_add() loses all its
> identity after the system is up and running. In order to reject any
> attempt to hot remove boot memory, the platform needs to remember all the
> memory that came early in the boot and then scan through it during
> arch_remove_memory().
> Ideally, it is the responsibility of [_]remove_memory() callers like the
> ACPI driver, DAX etc. to make sure they never attempt to hot remove a
> memory range which never got hot added by them in the first place. Also, unlike
> /sys/devices/system/memory/probe there is no 'unprobe' interface where the
> user can just trigger boot memory removal. Hence, unless there is a bug in
> ACPI, DAX or other callers, there should never be any attempt to hot remove
> boot memory in the first place.

That's fine if these callers give such guarantees. I just want to make
sure someone checked all the possible scenarios for memory hot-remove.
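
For reference, the "remember boot memory and scan it on removal" check
described in the quoted text could look roughly like the sketch below. This
is a plain user-space illustration, not kernel code: struct mem_range,
record_boot_range(), overlaps_boot_memory() and MAX_BOOT_RANGES are all
hypothetical names, and it assumes ranges do not wrap around the end of the
address space.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_BOOT_RANGES 32	/* hypothetical fixed table size */

/* Hypothetical record of one physical range present at boot. */
struct mem_range {
	uint64_t start;
	uint64_t size;
};

static struct mem_range boot_ranges[MAX_BOOT_RANGES];
static size_t nr_boot_ranges;

/* Remember a range seen early in boot; fails if the table is full. */
static bool record_boot_range(uint64_t start, uint64_t size)
{
	if (size == 0 || nr_boot_ranges >= MAX_BOOT_RANGES)
		return false;

	boot_ranges[nr_boot_ranges].start = start;
	boot_ranges[nr_boot_ranges].size = size;
	nr_boot_ranges++;
	return true;
}

/*
 * Return true if [start, start + size) intersects any recorded boot
 * range. A removal path could use this to reject the request.
 */
static bool overlaps_boot_memory(uint64_t start, uint64_t size)
{
	size_t i;

	if (size == 0)
		return false;

	for (i = 0; i < nr_boot_ranges; i++) {
		const struct mem_range *r = &boot_ranges[i];

		/* Half-open intervals overlap iff each starts before the other ends. */
		if (start < r->start + r->size && r->start < start + size)
			return true;
	}
	return false;
}
```

A removal request would call overlaps_boot_memory(start, size) first and
bail out on a match. Whether such a check belongs in arch code or in the
callers is exactly the open question in this thread.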