On Mon, May 05, 2025 at 09:38:43AM +0200, David Hildenbrand wrote:
On 05.05.25 09:28, Oscar Salvador wrote:
On Mon, May 05, 2025 at 09:16:48AM +0200, David Hildenbrand wrote:
memory hotplug code never calls register_one_node(), unless I am missing
something.
During add_memory_resource(), we call __try_online_node(nid, false), meaning
we skip register_one_node().
The only caller of __try_online_node(nid, true) is try_online_node(), called
from CPU hotplug code, and I *guess* that is not required.
Well, I guess this is because we need to link the cpus to the node.
register_one_node() has two jobs: 1) register cpus belonging to the node
and 2) register memory-blocks belonging to the node (if any).
Ah, via __register_one_node() ...
I would assume that an offline node
(1) has no memory
(2) has no CPUs
When we *hotplug* either memory or CPUs, and we first online the node, there
is nothing to register. Because if there would be something, the node would
already be online.
In particular, try_offline_node() will only offline a node if
(A) No present pages: No pages are spanned anymore. This includes
offline memory blocks.
(B) No present CPUs.
But maybe there is some case that I am missing ...
I actually hoped you and Oscar know how that stuff works :)
I tried to figure what is going on there and it all looks really convoluted.
So, on boot we have
cpu_up() ->
try_online_node() ->
bails out because all nodes are online (at least on
x86 AFAIU, see 1ca75fa7f19d ("arch/x86/mm/numa: Do
not initialize nodes twice"))
node_dev_init()i ->
register_one_node() ->
this one can use __register_one_node() and loop
over memblock regions.
And for the hotplug/unplug path, it seems that
register_memory_blocks_under_node(MEMINIT_EARLY) is superfluous, because if
a node had memory it wouldn't get offlined, and if we are hotplugging an
node with memory and cpus, memory hotplug anyway calls
register_memory_blocks_under_node_hotplug().
So, IMHO, register_one_node() should not call
register_memory_blocks_under_node() at all, but again, I might have missed
something :)