Re: [PATCH] x86: use near online node instead of round bin for numa

From: Andi Kleen
Date: Mon Oct 05 2009 - 14:51:50 EST


On Mon, Oct 05, 2009 at 11:40:46AM -0700, Yinghai Lu wrote:
> Andi Kleen wrote:
> > On Mon, Oct 05, 2009 at 11:09:59AM -0700, Yinghai Lu wrote:
> >> Andi Kleen wrote:
> >>> Yinghai Lu <yinghai@xxxxxxxxxx> writes:
> >>>
> >>>> The cpu to node mapping is set up in the following sequence:
> >>>> 1. numa_init_array: assigns cpus to online nodes round-robin
> >>>> 2. init_cpu_to_node: sets the mapping from apicid_to_node[] according to the SRAT,
> >>>> but only handles nodes that are online, and leaves cpus on nodes
> >>>> without ram (aka not online) on the round-robin assignment
> >>>> 3. later, srat_detect_node for intel/amd will use the first online node or a
> >>>> nearby node.
> >>>>
> >>>> The problem is that setup_per_cpu_areas() is called between 2 and 3, so the
> >>>> per_cpu area for a cpu still on the round-robin assignment can land on a
> >>>> different node, possibly one that is two hops away.
> >>>>
> >>>> So add find_near_online_node() and call it in init_cpu_to_node()
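
For readers following the thread, the difference between the round-robin fallback and a nearest-online-node fallback can be sketched in userspace C. This is only an illustration under stated assumptions, not the code from the patch: the 4-node distance table, the node_has_ram[] array and round_robin_node() are hypothetical stand-ins, and only the name find_near_online_node() comes from the description above.

```c
#include <limits.h>
#include <stdbool.h>

#define NR_NODES 4	/* hypothetical 4-node machine */
#define NO_NODE  (-1)

/*
 * Hypothetical SLIT-style distances for a square topology:
 * 10 = local, 20 = one hop, 30 = two hops.
 */
static const int node_dist[NR_NODES][NR_NODES] = {
	{ 10, 20, 20, 30 },
	{ 20, 10, 30, 20 },
	{ 20, 30, 10, 20 },
	{ 30, 20, 20, 10 },
};

/* Assumption: node 1 has cpus but no ram, so it is not "online". */
static const bool node_has_ram[NR_NODES] = { true, false, true, true };

/* Return the online node with the smallest distance from 'node'. */
static int find_near_online_node(int node)
{
	int best_node = NO_NODE;
	int min_dist = INT_MAX;

	for (int n = 0; n < NR_NODES; n++) {
		if (!node_has_ram[n])
			continue;
		if (node_dist[node][n] < min_dist) {
			min_dist = node_dist[node][n];
			best_node = n;
		}
	}
	return best_node;
}

/* Round-robin fallback: spread cpus over online nodes, ignoring distance. */
static int round_robin_node(int cpu)
{
	int online[NR_NODES];
	int count = 0;

	for (int n = 0; n < NR_NODES; n++)
		if (node_has_ram[n])
			online[count++] = n;

	return count ? online[cpu % count] : NO_NODE;
}
```

With this made-up table, a cpu numbered 4 that physically sits on node 1 is assigned to node 2 by round_robin_node(4), which is two hops from node 1, while find_near_online_node(1) picks node 0, which is one hop away.
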
> >>> This fallback case should not really happen anyways, unless the BIOS is buggy
> >>> (in this case it might be better to completely reject the SRAT because
> >>> more might be wrong).
> >> SRAT is right, and some node has no ram installed.
> >
> > In this case there should still be a PXM to define the CPU locality -- your BIOS is broken.
> > Please fix it there.
>
> I don't think so.

Let's put it like this: your BIOS does not describe the full system
topology, which is a severe BIOS bug. Putting hacks into Linux
to work around that is not the right solution.

-Andi

--
ak@xxxxxxxxxxxxxxx -- Speaking for myself only.