Re: [RFC PATCH] Memory hotplug support for arm64 platform

From: Florian Fainelli
Date: Wed Mar 29 2017 - 20:40:17 EST


Hi Andrea, Maciej,

On 02/06/2017 03:17 AM, Andrea Reale wrote:
> Hi Scott,
> Hi all,
>
> in reply to the issues that Scott reported last month, myself and Maciej
> investigated further by running quite a number of experiments on the
> physical and virtual environments we have avaialable.
>
> We collected all the results and relevant logs in a Web page at
> https://hotplug-tests.eu-gb.mybluemix.net/ so that anyone interested can
> go there and check all the details.
>
> The tl;dr version is that, in all configuration, we could not reproduce
> what Scott has described as "memory corruption". The only issue we
> encountered happens when the system is booted with a small amount of
> initial memory (e.g., mem=64M) and one tries to hot-add several sections
> of memory in ZONE_MOVABLE; in that case, the process is likely to fail
> when vmemmap tries to allocate chunks of 2^9 consecutive pages to make
> space for the `struct page`s describing the new memory; in fact, it
> seems likely that, in low memory situations, the system cannot find enough
> consecutive pages in ZONE_DMA or ZONE_NORMAL. This condition is not
> dependand on memory hot-plug; in fact, we counter-tested this by writing
> a simple module that just tries to allocate a few chunks of 2^9 pages,
> and we experienced that it fails when the system is booted with low
> memory (sources and logs in the Web page linked above).
>
> @Scott: were your referring to this issue, by any chance, in your
> previous emails? If not, we would really appreciate if you could help us
> reproduce the condition you are experiencing and/or give us a more detail
> of what are the symptoms of the corruption you are referring to.

One question regarding your patch posted here:
https://lkml.org/lkml/2016/12/14/188

While the "hack" that sets/clears NOMAP in order for pfn_valid() to
return false/true when appropriate during __add_pages() definitively
does seem to work to probe the memory section, don't you also hit the
same warning when you try to online that memory section in
pages_correctly_reserved() once you have cleared the NOMAP flag?

NB: I am working on the 4.1 kernel at the moment, but it seems to be
nearly identical in that regard.

>
> We are still running additional tests on other boards and we will update
> the Web page while we get them. If anyone happens to try these patches
> on their system, we warmly invite to send feedback with either
> negative or positive outcomes.

I will definitively give this a try on ARM64 since I need to get it
working there.

Do you mind posting a non-RFC patch?

Thanks!
--
Florian