Re: WTH is going on with memory hotplug sysf interface (was: Re: [RFC PATCH] mm, hotplug: get rid of auto_online_blocks)
From: Michal Hocko
Date: Mon Mar 13 2017 - 11:11:16 EST
Let's add Andi
On Fri 10-03-17 16:53:33, Michal Hocko wrote:
> On Fri 10-03-17 14:58:07, Michal Hocko wrote:
> > This would explain why onlining from the last block actually works but
> > to me this sounds like a completely crappy behavior. All we need to
> > guarantee AFAICS is that Normal and Movable zones do not overlap. I
> > believe there is even no real requirement about ordering of the physical
> > memory in Normal vs. Movable zones as long as they do not overlap. But
> > let's keep it simple for the start and always enforce the current status
> > quo that Normal zone is physically preceding Movable zone.
> > Can somebody explain why we cannot have a simple rule for Normal vs.
> > Movable which would be:
> > - block [pfn, pfn+block_size] can be Normal if
> >   !zone_populated(MOVABLE) || pfn+block_size < ZONE_MOVABLE->zone_start_pfn
> > - block [pfn, pfn+block_size] can be Movable if
> >   !zone_populated(NORMAL) || ZONE_NORMAL->zone_end_pfn < pfn
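For illustration only, the two conditions quoted above could be written as
a pair of predicates. This is a minimal userspace sketch, not kernel code:
struct zone_span and both function names are hypothetical stand-ins, and
the comparisons are copied verbatim from the rule, including its choice of
strict inequalities.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a zone's span; this is NOT the kernel's struct zone. */
struct zone_span {
	unsigned long start_pfn;	/* first pfn spanned by the zone */
	unsigned long end_pfn;		/* end of the span, used as in the quoted rule */
	bool populated;			/* true once the zone has any pages in it */
};

/*
 * Block [pfn, pfn + block_size] can be Normal if the Movable zone is empty
 * or the block ends strictly below the start of the Movable zone.
 */
static bool block_can_be_normal(const struct zone_span *movable,
				unsigned long pfn, unsigned long block_size)
{
	assert(block_size > 0);
	return !movable->populated || pfn + block_size < movable->start_pfn;
}

/*
 * Block [pfn, pfn + block_size] can be Movable if the Normal zone is empty
 * or the Normal zone ends strictly below the start of the block.
 */
static bool block_can_be_movable(const struct zone_span *normal,
				 unsigned long pfn, unsigned long block_size)
{
	assert(block_size > 0);
	return !normal->populated || normal->end_pfn < pfn;
}
```

With Normal ending at pfn 499 and Movable starting at pfn 1000, a block at
pfn 500 of size 100 passes both checks, so either zone would be acceptable
for it under this rule, while a block at pfn 950 could only be Movable.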
> OK, so while I was playing with this setup some more I probably got why
> this is done this way. All new memblocks are added to the zone Normal
> where they are accounted as spanned but not present. When we do
> online_movable we just cut from the end of the Normal zone and move it
> to Movable zone. This sounds really awkward. What was the reason to go
> this way? Why cannot we simply add those pages to the zone at the online
Answering to myself. So the reason seems to be 9d99aaa31f59 ("[PATCH]
x86_64: Support memory hotadd without sparsemem") which is no longer
	bool "Allow for memory hot-add"
	depends on SPARSEMEM || X86_64_ACPI_NUMA
	depends on ARCH_ENABLE_MEMORY_HOTPLUG
	depends on COMPILE_TEST || !KASAN
so it is either SPARSEMEM or X86_64_ACPI_NUMA that would have to be enabled.
	prompt "ACPI NUMA detection"
	depends on X86_64 && NUMA && ACPI && PCI
But I do not see any way how to enable anything but SPARSEMEM for x86_64
	prompt "Memory model"
	depends on SELECT_MEMORY_MODEL
	default DISCONTIGMEM_MANUAL if ARCH_DISCONTIGMEM_DEFAULT
	default SPARSEMEM_MANUAL if ARCH_SPARSEMEM_DEFAULT
ARCH_DISCONTIGMEM_DEFAULT is 32b only
	depends on NUMA && X86_32
and ARCH_SPARSEMEM_DEFAULT is enabled on 64b. So I guess whatever was
the reason to add this code back in 2006 is not true anymore. So I am
really wondering. Do we absolutely need to assign pages which are not
onlined yet to the ZONE_NORMAL unconditionally? Why cannot we put them
out of any zone and wait for memory online operation to put them where