RE: [2/8,v3] NUMA Hotplug Emulator: infrastructure of NUMA hotplug emulation

From: Li, Haicheng
Date: Sun Nov 21 2010 - 10:14:59 EST


David Rientjes wrote:
> On Fri, 19 Nov 2010, Shaohui Zheng wrote:
>
>> nr_node_ids is the number of possible nodes. When we do a regular memory
>> online, the memory is onlined to a possible node, which is already
>> counted in nr_node_ids.
>>
>> If you increment nr_node_ids dynamically when a node comes online, it
>> causes a lot of problems. Much data is initialized according to
>> nr_node_ids. That is our experience from debugging the emulator.
>>
>
> I think what we'll end up wanting to do is something like this, which adds
> a numa=possible=<N> parameter for x86; this will add an additional N
> possible nodes to node_possible_map that we can use to online later. It
> also adds a new /sys/devices/system/memory/add_node file which takes a
> typical "size@start" value to hot-add an emulated node. For example,
> using "mem=2G numa=possible=1" on the command line and doing
> echo "128M@0x80000000" > /sys/devices/system/memory/add_node would hot-add
> a node of 128M.
>
> Comments?

Sorry for the late response; I have been on a business trip recently.

David, your original concern was just about power and flexibility. I'm sure our implementation can better meet those requirements.

IMHO, I don't see any gain in power or flexibility from your patch compared to our original implementation. It just makes things more complex and messy.

Why not use "numa=hide=N*size" as originally implemented?
- Later you just need to online the node whenever you want. This naturally and exactly emulates the behavior that current HW provides.
- N is the number of possible nodes. We can use 128M as the default size for each hidden node if the user doesn't specify a size.
- If the user wants more memory for a hidden node, they just need to specify the "size".
- Besides, the user can also use "mem=" to hide more memory and later use the mem-add interface to freely attach more memory to a hidden node at runtime.
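
To illustrate the semantics described in the list above, here is a minimal userspace sketch in Python of how a "numa=hide=" value could be interpreted. It is not code from the patch. The function names parse_size and parse_numa_hide are hypothetical, and the accepted syntax (a bare "N" or "N*size", with K/M/G suffixes in the style of the kernel's memparse()) is an assumption based only on the description in this mail.

```python
import re

# Default size per hidden node when no "*size" is given (128M, as proposed above).
DEFAULT_NODE_SIZE = 128 << 20

# Assumed suffixes, modelled on the K/M/G handling of the kernel's memparse().
_SUFFIX_SHIFT = {"": 0, "K": 10, "M": 20, "G": 30}


def parse_size(text):
    """Hypothetical helper: turn a size such as '128M' or '1G' into bytes."""
    match = re.fullmatch(r"(\d+)([KMG]?)", text.strip().upper())
    if match is None:
        raise ValueError(f"bad size: {text!r}")
    return int(match.group(1)) << _SUFFIX_SHIFT[match.group(2)]


def parse_numa_hide(value):
    """Hypothetical helper: split 'N' or 'N*size' into (node count, bytes per node)."""
    count_text, sep, size_text = value.partition("*")
    count = int(count_text)
    if count <= 0:
        raise ValueError(f"node count must be positive: {value!r}")
    # Fall back to the 128M default only when no '*size' part was supplied.
    size = parse_size(size_text) if sep else DEFAULT_NODE_SIZE
    return count, size


if __name__ == "__main__":
    for arg in ("2", "2*256M", "1*1G"):
        count, size = parse_numa_hide(arg)
        print(f"numa=hide={arg}: {count} hidden node(s), {size >> 20}M each")
```

With this reading, "numa=hide=2" reserves two hidden nodes of 128M each, while "numa=hide=2*256M" reserves two hidden nodes of 256M each.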

Your patch introduces an additional dependency on "mem=", whereas ours is simple and remains compatible with both "mem=" and "numa=emu".


-haicheng