Re: [PATCH part5 0/7] Arrange hotpluggable memory as ZONE_MOVABLE.

From: Tejun Heo
Date: Mon Aug 12 2013 - 10:50:29 EST


Hello,

On Thu, Aug 08, 2013 at 06:16:12PM +0800, Tang Chen wrote:
> [How we do this]
>
> In ACPI, SRAT(System Resource Affinity Table) contains NUMA info. The memory
> affinities in SRAT record every memory range in the system, and also, flags
> specifying if the memory range is hotpluggable.
> (Please refer to ACPI spec 5.0 5.2.16)
>
> With the help of SRAT, we have to do the following two things to achieve our
> goal:
>
> 1. When doing memory hot-add, allow the users arranging hotpluggable as
> ZONE_MOVABLE.
> (This has been done by the MOVABLE_NODE functionality in Linux.)
>
> 2. when the system is booting, prevent bootmem allocator from allocating
> hotpluggable memory for the kernel before the memory initialization
> finishes.
> (This is what we are going to do. See below.)

I think it's in much better shape than before, but there are still a
couple of things bothering me.

* Why can't it be opportunistic? It's silly, for example, to fail
boot because ACPI tells the kernel that all memory is hotpluggable,
especially as there'd be plenty of memory sitting around doing
nothing, and failing to boot is one of the gravest failure modes.
The HOTPLUG flag can be advisory, right? Try to allocate
!hotpluggable memory first, but if that fails, ignore the flag and
allocate from anywhere, much like the try_nid allocations.

* Similar to the point hpa raised. If this can be made opportunistic,
do we need the strict reordering to discover things earlier?
Shouldn't it be possible to configure memblock to allocate close to
the kernel image until hotplug and NUMA information is available?
In most sane cases, the memory allocated will be contained in a
non-hotpluggable node anyway, and in the cases where it isn't,
hotplug wouldn't work but the system would still boot and function
perfectly fine.
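
To make the two points above concrete, here is a rough userspace sketch
of the policy, not kernel code. Everything in it is hypothetical and
made up for illustration: struct region, alloc_pass() and early_alloc()
are not memblock interfaces. It assumes the regions are sorted by
ascending address with the first one starting just past the kernel
image, that region bases are non-zero (0 is used as the failure
return), and it ignores alignment entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a memory region (not memblock's). */
struct region {
	uint64_t base;		/* start address, assumed non-zero */
	uint64_t size;		/* total bytes in the region */
	uint64_t used;		/* bytes already handed out */
	bool hotpluggable;	/* firmware hint, e.g. from SRAT */
};

/*
 * First-fit bump allocation over regions in array order.  With
 * skip_hotplug set, regions marked hotpluggable are passed over.
 * Returns 0 if nothing fits.
 */
static uint64_t alloc_pass(struct region *r, size_t n, uint64_t size,
			   bool skip_hotplug)
{
	for (size_t i = 0; i < n; i++) {
		if (skip_hotplug && r[i].hotpluggable)
			continue;
		if (r[i].size - r[i].used >= size) {
			uint64_t addr = r[i].base + r[i].used;

			r[i].used += size;
			return addr;
		}
	}
	return 0;
}

/*
 * Before the topology is known, take the lowest address that fits,
 * i.e. stay close to the kernel image.  Once it is known, prefer
 * !hotpluggable memory, but treat the flag as advisory and fall back
 * to any region rather than failing the allocation.
 */
static uint64_t early_alloc(struct region *r, size_t n, uint64_t size,
			    bool topology_known)
{
	uint64_t addr = 0;

	assert(size > 0);
	if (topology_known)
		addr = alloc_pass(r, n, size, true);
	if (!addr)
		addr = alloc_pass(r, n, size, false);
	return addr;
}
```

With a single region that is marked hotpluggable, early_alloc() still
succeeds through the fallback pass, which is the "don't fail boot"
behavior being asked for. The cost is only that the region can't be
removed later.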

Thanks.

--
tejun