Re: [RFC PATCH] x86/topo: Unify srat_detect_node among amd/intel/hygon

From: K Prateek Nayak

Date: Mon Mar 30 2026 - 00:58:19 EST


Hello Nikola,

On 3/29/2026 5:38 PM, Nikola Z. Ivanov wrote:
> This change is provoked by an observed warning after
> commit 717b64d58cff ("x86/topo: Replace x86_has_numa_in_package")
> when faking numa nodes on intel.
>
> For example:
>
> qemu-system-x86_64 \
> -kernel arch/x86/boot/bzImage \
> -append "console=ttyS0 root=/dev/sda debug numa=fake=2" \
> -hda $IMAGES/unstable.img \
> -cpu qemu64,vendor=GenuineIntel \
> -nographic \
> -m 2G \
> -smp 2 \

You can also say:

-smp 2,sockets=2

and that fixes the warning. Why is that not a valid solution?

>
> Will trigger:
>
> [ 0.066755][ T0] ------------[ cut here ]------------
> [ 0.066755][ T0] WARNING: arch/x86/kernel/smpboot.c:698 at
> set_cpu_sibling_map+0xe41/0x1f90, CPU#1: swapper/1/0
> [ 0.066755][ T0] Call Trace:
> [ 0.066755][ T0] <TASK>
> [ 0.066755][ T0] ap_starting+0x9e/0x140
> [ 0.066755][ T0] ? __pfx_ap_starting+0x10/0x10
> [ 0.066755][ T0] ? fpu__init_cpu_xstate+0x5c/0x320
> [ 0.066755][ T0] start_secondary+0x66/0x110
> [ 0.066755][ T0] common_startup_64+0x13e/0x147
> [ 0.066755][ T0] </TASK>
>
> smpboot.c suggests that the topology is invalid as
> the CPUs are in the same package but different nodes.

To me, that looks like a broken topology from a virtualization use case
and the user can easily go fix their QEMU cmdline if they care. I'm
pretty sure folks using NUMA emulation in production know what they are
doing.
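
To make the "same package, different nodes" condition concrete, here is
a rough userspace sketch. It is a toy model and not the smpboot.c check:
struct toy_cpu and count_pkg_node_conflicts() are hypothetical names,
and the (package, node) pairs a caller feeds it are assumptions about
what the guest ends up with.

```c
/* Hypothetical toy model, not kernel code: one entry per CPU. */
struct toy_cpu {
	int pkg;	/* physical package (socket) ID */
	int node;	/* NUMA node ID assigned to this CPU */
};

/*
 * Count the CPU pairs that share a package but sit on different
 * nodes, i.e. the situation the warning complains about.
 */
static int count_pkg_node_conflicts(const struct toy_cpu *cpus, int nr_cpus)
{
	int conflicts = 0;

	for (int i = 0; i < nr_cpus; i++) {
		for (int j = i + 1; j < nr_cpus; j++) {
			if (cpus[i].pkg == cpus[j].pkg &&
			    cpus[i].node != cpus[j].node)
				conflicts++;
		}
	}

	return conflicts;
}
```

Assuming "-smp 2" puts both CPUs in one package while numa=fake=2 splits
them across nodes, the input { {0, 0}, {0, 1} } yields one conflict.
With one package per node, { {0, 0}, {1, 1} }, it yields none.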

>
> Fix this by unifying the srat_detect_node function
> among amd/intel/hygon and taking the amd/hygon approach
> of falling back to LLC when SRAT is not detected.

As far as the AMD/Hygon unification goes, I don't mind it, but
someone has to confirm whether nearby_node() holds for all APICID
distributions on Intel.

> Place the function inside common.c and expose it in topology.h

There is no need to make it visible outside of arch/x86/kernel/cpu/.
Perhaps arch/x86/kernel/cpu/cpu.h?

>
> The hygon code is already basically identical to amd
> except for the way it obtains the LLC ID.
> We can reuse that from the hygon code since we
> already have the struct cpuinfo_x86 passed to us.
>
> Signed-off-by: Nikola Z. Ivanov <zlatistiv@xxxxxxxxx>
> ---
> This is marked RFC as I lack the context for the reason
> why the intel code looks the way it does. I can see
> it went through a few changes in the 2008-2010 year range,
> which makes be believe that the comment regarding
> "not doing AMD heuristics for now" is long overdue.

So prior to your patch, if I launch:

-smp 4,sockets=2,cores=2

and "numa=fake=2", the srat_detect_node() for an Intel VM maps:

CPU#0 -> Node#0
CPU#1 -> Node#1
CPU#2 -> Node#0
CPU#3 -> Node#1

Which resembles Intel baremetal node assignments where the CPUs
are interleaved. After your patch, it does:

CPU#0 -> Node#0
CPU#1 -> Node#0
CPU#2 -> Node#0
CPU#3 -> Node#0

So despite there being 2 LLCs, the Node assignments all go to Node#0
which may have other unintended consequences.

The statement "falling back to LLC when SRAT is not detected" isn't
accurate, right? We have 2 LLCs and 2 Nodes, but the topology bits
associate both LLCs with the same node.
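
As a rough illustration of the difference between the two mappings
above, the sketch below counts how many nodes end up with at least one
CPU. It is a toy userspace model only: nodes_with_cpus() and
TOY_NR_NODES are hypothetical names, and the cpu_to_node arrays are
simply the tables from this mail.

```c
#define TOY_NR_NODES 2	/* assumed: numa=fake=2 */

/* Hypothetical helper: count the nodes that have at least one CPU. */
static int nodes_with_cpus(const int *cpu_to_node, int nr_cpus)
{
	int seen[TOY_NR_NODES] = { 0 };
	int used = 0;

	for (int cpu = 0; cpu < nr_cpus; cpu++) {
		int node = cpu_to_node[cpu];

		/* Ignore out-of-range node IDs in this toy model. */
		if (node < 0 || node >= TOY_NR_NODES)
			continue;

		if (!seen[node]) {
			seen[node] = 1;
			used++;
		}
	}

	return used;
}
```

Feeding it { 0, 1, 0, 1 } (before the patch) gives 2, whereas
{ 0, 0, 0, 0 } (after the patch) gives 1, so Node#1 is left without any
CPUs.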

I'm all for unifying AMD's and Hygon's srat_detect_node(), but unifying
all three for an obviously broken use-case isn't a good motivation.

I'll let others comment since they are more familiar with the NUMA
emulation bits and maybe all this is acceptable.

--
Thanks and Regards,
Prateek