Re: [RFC][PATCH 5/6] x86/topo: Fix SNC topology mess

From: Peter Zijlstra

Date: Fri Feb 27 2026 - 06:59:11 EST


On Fri, Feb 27, 2026 at 01:07:40AM +0800, Chen, Yu C wrote:
> Hi Peter,
>
> On 2/26/2026 6:49 PM, Peter Zijlstra wrote:
> > +	int u = __num_nodes_per_package;
>
> Yes, this is much simpler, thanks for the patch!
>
> > +	long d = 0;
> > +	int x, y;
> > +
> > +	/*
> > +	 * Is this a unit cluster on the trace?
> > +	 */
> > +	if ((i / u) == (j / u))
> > +		return node_distance(i, j);
>
> If the number of nodes per package is 3, we assume that
> every 3 consecutive nodes are SNC siblings (on the same
> trace): node0, node1, and node2 are SNC siblings, while
> node3, node4, and node5 form another group of SNC siblings.
>
> I have a curious thought: could it be possible that
> node0, node2, and node4 are SNC siblings, and node1,
> node3, and node5 are another set of SNC siblings instead?

Yes, give a BIOS guy enough bong-hits and this can be.

That said (and knock on wood), I've so far never seen this (and please
people, don't take this as a challenge).

> Then I studied the code a little more, node ids are dynamically
> allocated via the acpi_map_pxm_to_node, so the assignment of node
> ids depends on the order in which each processor affinity structure
> is listed in the SRAT table. For example, suppose CPU0 belongs to
> package0 and CPU1 belongs to package1, but their entries are placed
> consecutively in the SRAT. In this case, the Proximity Domain of
> CPU0 would be mapped to node0 via acpi_map_pxm_to_node, and CPU1’s
> Proximity Domain would be assigned node1. The logic above would
> then treat them as belonging to the same package, even though they
> are physically in different packages. However, I believe such a
> scenario is unlikely to occur in practice, and if it does happen
> it should be considered a BIOS bug, if I understand correctly.

Just so.

The thing I worried about is getting memory only nodes iterated in
between or something. But as long as the CPU enumeration happens before
the 'other' crud, then the CPU node mappings should be the consecutive
low numbers and it all just works.