Re: [2.6.20][PATCH] fix mempolicy error check on a system with memory-less-node

From: KAMEZAWA Hiroyuki
Date: Wed Feb 07 2007 - 10:28:56 EST


On Wed, 7 Feb 2007 06:05:56 -0800 (PST)
Christoph Lameter <clameter@xxxxxxx> wrote:

> On Wed, 7 Feb 2007, KAMEZAWA Hiroyuki wrote:
>
> > > IMHO there shouldn't be any memory less nodes. The architecture code
> > > should not create them. The CPU should be assigned to a nearby node instead.
> > > At least x86-64 ensures that.
> > >
> > AFAIK, ia64 creates nodes just depends on SRAT's possible resource information.
> > Then, ia64 can create cpu-memory-less-node(node with no available resource.).
> > (*)I don't like this.
>
> I think that is only true for !SN2 platforms? Could we fix this?
>
AFAIK, some vendor (HP?) has the following configuration:
- node0 .... cpu only node
- node1 .... cpu only node
- node2 .... memory only node.
This is because of their memory-interleave technique.

Our 64cpu socket NUMA system also has a config
- node0 .... cpu+memory node
- node 1 - 7 .... cpu only nodes
for dividing the scheduler domain. (Old kernels had a problem with a
big sched-domain.)

To fix memory-less-node, we have to test the performance of a
"very-big-scheduler-domain" and define a rule for cpu-hot-add, such as
"a new cpu will be added to the nearest node".
(node-hot-add will have to add some hook...)
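
As a rough illustration of the "nearest node" rule above, here is a minimal
user-space sketch. It is not kernel code: struct topology,
nearest_memory_node(), MAX_NODES and NO_NODE are hypothetical names made up
for this example, and the distance table is assumed to be SLIT-style
(smaller value means closer).

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define MAX_NODES 8     /* hypothetical limit for this sketch */
#define NO_NODE   (-1)  /* returned when no node has memory */

/* Hypothetical user-space model of a NUMA topology; not a kernel structure. */
struct topology {
	int  nr_nodes;
	bool has_memory[MAX_NODES];
	int  distance[MAX_NODES][MAX_NODES]; /* SLIT-style: smaller is closer */
};

/*
 * Return the node with memory that is closest to node 'from',
 * or NO_NODE if no node has memory. On a tie, the lowest node
 * number wins because only a strictly smaller distance replaces
 * the current best.
 */
static int nearest_memory_node(const struct topology *t, int from)
{
	int best = NO_NODE;
	int best_dist = INT_MAX;

	assert(from >= 0 && from < t->nr_nodes);

	for (int n = 0; n < t->nr_nodes; n++) {
		if (!t->has_memory[n])
			continue;	/* skip cpu-only (memory-less) nodes */
		if (t->distance[from][n] < best_dist) {
			best_dist = t->distance[from][n];
			best = n;
		}
	}
	return best;
}
```

With the "node0, node1 cpu only / node2 memory only" layout, a cpu that
belongs to node0 would be mapped to node2 by this rule. How the real
hot-add path would get the distance at that point is left open here.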

I don't know whether others who created memory-less-nodes in the past have
some other issues.

There may be some systems with a complicated topology and a complicated
PXM map.


> > If we don't allow memory-less-node, we may have to add some code for cpu-hot-add.
> > cpus should be moved to a nearby node at hotadd.
> > And node-hot-add has to take care that cpus mustn't be added before memory, so that
> > cpu-driven node-hot-add will never occur. (ACPI's 'container' device spec can't guarantee this.)
>
> Well you could bring down the cpu and bring it up again? This would also
> assure the best placement of the runtime structures for node?
>
The cpu-to-node relationship is fixed in the early stage of cpu hotplug.
I'm not sure we can bring a cpu down/up again in a clean way. After a cpu is
added, the kernel currently loses its original PXM value.

About runtime structures:
The runtime structure placement for a hot-added node is another issue here.
Goto-san and I have a plan for optimized placement of structures and will
try it when we can. (We are now assigned to RHEL5 stabilization tasks...)

Moving per-cpu-area at hotadd does not look easy.
IMHO, maybe we have to use stop_machine_run() to move it.

Anyway, I'll post another *easy* patch just to fix the NULL pointer access.
Please review.

Thanks,
-Kame





