Re: [PATCH v3 0/4] x86/cpu/topology: Work around the nuances of virtualization on AMD/Hygon

From: K Prateek Nayak
Date: Tue Aug 19 2025 - 10:29:46 EST


Hello Boris,

On 8/19/2025 5:04 PM, Borislav Petkov wrote:
> Lemme try to make some sense of this because the wild use of names and things
> is making my head spin...
>
> On Mon, Aug 18, 2025 at 06:04:31AM +0000, K Prateek Nayak wrote:
>> When running an AMD guest on QEMU with > 255 cores, the following FW_BUG
>> was noticed with recent kernels:
>>
>> [Firmware Bug]: CPU 512: APIC ID mismatch. CPUID: 0x0000 APIC: 0x0200
>>
>> Naveen, Sairaj debugged the cause to commit c749ce393b8f ("x86/cpu: Use
>> common topology code for AMD") where, after the rework, the initial
>> APICID was set using the CPUID leaf 0x8000001e EAX[31:0] as opposed to
>
> That's
>
> CPUID_Fn8000001E_ECX [Node Identifiers] (Core::X86::Cpuid::NodeId)

Small correction here: this is actually

CPUID_Fn8000001E_EAX [Extended APIC ID] (Core::X86::Cpuid::ExtApicId)

>
>> the value from CPUID leaf 0xb EDX[31:0] previously.
>
> That's
>
> CPUID_Fn0000000B_EDX [Extended Topology Enumeration]
> (Core::X86::Cpuid::ExtTopEnumEdx)
>
>> This led us down a rabbit hole of XTOPOEXT vs TOPOEXT support, preferred
>
> What is XTOPOEXT?
>
> CPUID_Fn0000000B_EDX?
>
> Please define all your things properly so that we can have common base when
> reading this text.

Sorry about that! This should actually be "X86_FEATURE_XTOPOLOGY", which
is a synthetic feature that is set when topology parsing via one of the
following CPUID leaves is successful:

- 0x1f
  V2 Extended Topology Enumeration Leaf
  (Intel only)

- 0x80000026
  CPUID_Fn80000026_E[A,B,C]X_x0[0...3] [Extended CPU Topology]
  Core::X86::Cpuid::ExCpuTopologyE[a,b,c]x[0..3]
  (AMD only)

- 0xb
  CPUID_Fn0000000B_E[A,B,C]X_x0[0..2] [Extended Topology Enumeration]
  Core::X86::Cpuid::ExtTopEnumE[a,b,c]x[0..2]
  (Both Intel and AMD)

The parsing of the leaves is tried in the same order as above.
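
To make the ordering concrete, here is a minimal userspace sketch of the
selection logic. It is not the kernel code: the struct, the enum, and the
function names are made up for illustration, and the "EBX of subleaf 0 is
non-zero" usability test is an assumption borrowed from the SDM rule for
leaf 0xb.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical vendor tag; not a kernel type. */
enum cpu_vendor { VENDOR_INTEL, VENDOR_AMD_OR_HYGON, VENDOR_OTHER };

/* Hypothetical snapshot of the CPUID values the decision depends on. */
struct cpuid_snapshot {
	enum cpu_vendor vendor;
	uint32_t max_basic_leaf;  /* CPUID 0x0 EAX */
	uint32_t max_ext_leaf;    /* CPUID 0x80000000 EAX */
	uint32_t ebx_1f;          /* CPUID 0x1f, subleaf 0, EBX */
	uint32_t ebx_80000026;    /* CPUID 0x80000026, subleaf 0, EBX */
	uint32_t ebx_0b;          /* CPUID 0xb, subleaf 0, EBX */
};

/* Assumed usability test: leaf is in range and subleaf 0 reports EBX != 0. */
static bool leaf_usable(uint32_t max_leaf, uint32_t leaf, uint32_t ebx_subleaf0)
{
	return max_leaf >= leaf && ebx_subleaf0 != 0;
}

/*
 * Returns the first leaf that looks usable, tried in the order
 * 0x1f (Intel only), 0x80000026 (AMD only), 0xb (both), or 0 if none is.
 * A non-zero result corresponds to the case where XTOPOLOGY would be set.
 */
static uint32_t pick_topology_leaf(const struct cpuid_snapshot *s)
{
	assert(s != NULL);

	if (s->vendor == VENDOR_INTEL &&
	    leaf_usable(s->max_basic_leaf, 0x1f, s->ebx_1f))
		return 0x1f;

	if (s->vendor == VENDOR_AMD_OR_HYGON &&
	    leaf_usable(s->max_ext_leaf, 0x80000026, s->ebx_80000026))
		return 0x80000026;

	if (leaf_usable(s->max_basic_leaf, 0x0b, s->ebx_0b))
		return 0x0b;

	return 0;
}
```

With this sketch, an AMD guest that exposes leaf 0xb but not leaf
0x80000026 ends up on 0xb, which is the case that matters for the rest
of this thread.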

>
> TOPOEXT is, I presume:
>
> #define X86_FEATURE_TOPOEXT ( 6*32+22) /* "topoext" Topology extensions CPUID leafs */
>
> Our PPR says:
>
> CPUID_Fn80000001_ECX [Feature Identifiers] (Core::X86::Cpuid::FeatureExtIdEcx)
>
> "22 TopologyExtensions: topology extensions support. Read-only. Reset:
> Fixed,1. 1=Indicates support for Core::X86::Cpuid::CachePropEax0 and
> Core::X86::Cpuid::ExtApicId."
>
> Those leafs are:
>
> CPUID_Fn8000001D_EAX_x00 [Cache Properties (DC)] (Core::X86::Cpuid::CachePropEax0)
>
> DC topology info. Probably not important for this here.
>
> and
>
> CPUID_Fn8000001E_EAX [Extended APIC ID] (Core::X86::Cpuid::ExtApicId)
>
> the extended APIC ID is there.
>
> How is this APIC ID different from the extended APIC ID in
>
> CPUID_Fn0000000B_EDX [Extended Topology Enumeration] (Core::X86::Cpuid::ExtTopEnumEdx)
>
> ?

On baremetal, they are the same. On QEMU, when we launch a guest with
a topology that contains more than 256 cores on a single socket, QEMU
zeroes out all the bits in CPUID_Fn8000001E [1] since it fears a
collision in the "CoreId[7:0]" field of
"CPUID_Fn8000001E_EBX [Core Identifiers] (Core::X86::Cpuid::CoreId)".

Since the "LogProcAtThisLevel[15:0]" field of
"CPUID_Fn0000000B_EBX_x01 [Extended Topology Enumeration]" is 16 bits
wide, it can describe a domain with up to 2^16 - 1 logical processors,
so the Core ID can always be derived correctly from leaf 0xb even when
the number of cores in the guest topology crosses 256.
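
As a rough illustration of the width problem, the sketch below derives a
core ID from an x2APIC ID using the shift widths that leaf 0xb reports in
EAX[4:0], and then shows what an 8-bit CoreId field would keep of it. The
function names are hypothetical, and the example values assume a guest
without SMT (thread shift of 0), which is only an assumption about the
reported setup.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Core ID within the package, derived from the x2APIC ID.
 * smt_shift: leaf 0xb, subleaf 0, EAX[4:0] (bits below the core ID)
 * pkg_shift: leaf 0xb, subleaf 1, EAX[4:0] (bits below the package ID)
 */
static uint32_t core_id_from_x2apic(uint32_t x2apic_id,
				    unsigned int smt_shift,
				    unsigned int pkg_shift)
{
	assert(smt_shift <= pkg_shift && pkg_shift < 32);

	/* Drop the package bits, then drop the thread bits. */
	uint32_t below_pkg = x2apic_id & ((UINT32_C(1) << pkg_shift) - 1);

	return below_pkg >> smt_shift;
}

/* What an 8-bit field such as CoreId[7:0] can hold of a core ID. */
static uint8_t truncate_to_coreid_field(uint32_t core_id)
{
	return (uint8_t)(core_id & 0xff);
}
```

Under those assumptions, core_id_from_x2apic(0x200, 0, 10) yields 512,
while truncate_to_coreid_field(512) yields 0, the same value core 0 would
report. That aliasing is the collision QEMU is trying to avoid.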

>
>> order of their parsing, and QEMU nuances like [1] where QEMU 0's out the
>> CPUID leaf 0x8000001e on CPUs where Core ID crosses 255 fearing a
>> Core ID collision in the 8 bit field which leads to the reported FW_BUG.
>
> Is that what the hw does though?

We don't have baremetal systems with more than 256 cores per socket
yet. When that happens, I believe the expectation from H/W is to just
use the CPUID_Fn80000026 leaf or the CPUID_Fn0000000B leaf.

>
> Has this been verified instead of willy nilly clearing CPUID leafs in qemu?
>
>> Following were major observations during the debug which the two
>> patches address respectively:
>>
>> 1. The support for CPUID leaf 0xb is independent of the TOPOEXT feature
>
> Yes, PPR says so.
>
>> and is rather linked to the x2APIC enablement.
>
> Because the SDM says:
>
> "Bits 31-00: x2APIC ID of the current logical processor."
>
> ?

SDM Vol. 3A Sec. 11.12.8 "CPUID Extensions And Topology Enumeration"
reads:

For Intel 64 and IA-32 processors that support x2APIC, a value of 1
reported by CPUID.01H:ECX[21] indicates that the processor supports
x2APIC and the extended topology enumeration leaf (CPUID.0BH).

The extended topology enumeration leaf can be accessed by executing
CPUID with EAX = 0BH. Processors that do not support x2APIC may
support CPUID leaf 0BH. Software can detect the availability of the
extended topology enumeration leaf (0BH) by performing two steps:

1. Check maximum input value for basic CPUID information by executing
CPUID with EAX= 0. If CPUID.0H:EAX is greater than or equal or 11
(0BH), then proceed to next step

2. Check CPUID.EAX=0BH, ECX=0H:EBX is non-zero.

If both of the above conditions are true, extended topology
enumeration leaf is available.
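
The two SDM steps quoted above boil down to a small predicate. This is
only a sketch: the struct and the function name are hypothetical, and the
register values are passed in rather than read with CPUID so that the
logic can be checked on any machine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the output of one CPUID invocation. */
struct cpuid_regs {
	uint32_t eax, ebx, ecx, edx;
};

/*
 * max_basic_leaf: CPUID.0H:EAX
 * leaf0b_sub0:    output of CPUID with EAX=0BH, ECX=0H
 */
static bool leaf_0b_available(uint32_t max_basic_leaf,
			      const struct cpuid_regs *leaf0b_sub0)
{
	assert(leaf0b_sub0 != NULL);

	/* Step 1: the maximum basic leaf must be at least 0BH. */
	if (max_basic_leaf < 0x0b)
		return false;

	/* Step 2: EBX of subleaf 0 must be non-zero. */
	return leaf0b_sub0->ebx != 0;
}
```

In userspace, the inputs could be gathered with __get_cpuid_max() and
__get_cpuid_count() from the compiler's <cpuid.h>; neither step involves
the x2APIC or TOPOEXT feature bits.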

>
> Is our version not containing the x2APIC ID?

We too have the Extended APIC ID in both CPUID_Fn0000000B and
CPUID_Fn8000001E_EAX, and they match on baremetal. The problem only
affects virtualized guests whose topology contains more than 256
cores per socket, because of [1].
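
To show how the reported FW_BUG falls out of this, here is a small sketch
using the values from the log (CPUID: 0x0000, APIC: 0x0200). The struct,
the function names, and the preference order (leaf 0xb first, then leaf
0x8000001e) are assumptions for illustration only, not a description of
what the kernel or this series does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of the two places an APIC ID can be read from. */
struct apicid_sources {
	bool     have_leaf_0b;       /* leaf 0xb parsed successfully */
	uint32_t leaf_0b_edx;        /* CPUID_Fn0000000B_EDX */
	bool     have_topoext;       /* X86_FEATURE_TOPOEXT */
	uint32_t leaf_8000001e_eax;  /* CPUID_Fn8000001E_EAX */
};

/* Assumed preference: leaf 0xb, then leaf 0x8000001e, then a fallback. */
static uint32_t pick_initial_apicid(const struct apicid_sources *s,
				    uint32_t fallback)
{
	assert(s != NULL);

	if (s->have_leaf_0b)
		return s->leaf_0b_edx;
	if (s->have_topoext)
		return s->leaf_8000001e_eax;
	return fallback;
}

/* The condition behind the "APIC ID mismatch" firmware-bug message. */
static bool apicid_mismatch(uint32_t initial_apicid, uint32_t apicid)
{
	return initial_apicid != apicid;
}
```

For the guest CPU in the report, leaf 0x8000001e reads as 0 while the
APIC ID is 0x200. Taking the initial APIC ID from the zeroed leaf gives a
mismatch, whereas taking it from leaf 0xb (0x200) does not.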

>
>> On baremetal, this has
>> not been a problem since TOPOEXT support (Fam 0x15 and above)
>> predates the support for CPUID leaf 0xb (Fam 0x17[Zen2] and above)
>> however, in virtualized environment, the support for x2APIC can be
>> enabled independent of topoext where QEMU expects the guest to parse
>> the topology and the APICID from CPUID leaf 0xb.
>
> So we're fixing a qemu bug?
>
> Why isn't qemu force-enabling TOPOEXT support when one requests x2APIC?
>
> My initial reaction: fix qemu.

This is possible; however, what would be the right thing to do for
CPUID_Fn8000001E_EBX [Core Identifiers] (Core::X86::Cpuid::CoreId)?

Should QEMU just wrap and start counting the Core Identifiers again
from 0?

Or should QEMU go ahead and populate just the
CPUID_Fn8000001E_EAX [Extended APIC ID] (Core::X86::Cpuid::ExtApicId)
fields and continue to zero out EBX and ECX when CoreID > 255?

[1] https://github.com/qemu/qemu/commit/35ac5dfbcaa4b

--
Thanks and Regards,
Prateek