Re: [PATCH] KVM: x86: Add Intel CPUID.1F cpuid emulation support

From: Like Xu
Date: Thu Apr 25 2019 - 02:02:45 EST

On 2019/4/25 12:18, Xiaoyao Li wrote:
On Thu, 2019-04-25 at 10:58 +0800, Like Xu wrote:
On 2019/4/24 22:32, Sean Christopherson wrote:
Now that I understand how min() works...

On Mon, Apr 22, 2019 at 02:40:34PM +0800, Like Xu wrote:
Expose Intel V2 Extended Topology Enumeration Leaf to guest only when
host system has multiple software-visible die within each package.

Signed-off-by: Like Xu <like.xu@xxxxxxxxxxxxxxx>
arch/x86/kvm/cpuid.c | 13 +++++++++++++
1 file changed, 13 insertions(+)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index fd39516..9fc14f2 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -65,6 +65,16 @@ u64 kvm_supported_xcr0(void)
return xcr0;
+/* We need to check if the host cpu has multi-chip packaging technology.
+static bool kvm_supported_intel_mcp(void)
+ u32 eax, ignored;
+ cpuid_count(0x1f, 0, &eax, &ignored, &ignored, &ignored);

This is broken because of how CPUID works for unsupported input leafs:

If a value entered for CPUID.EAX is higher than the maximum input value
for basic or extended function for that processor then the data for the
highest basic information leaf is returned.

For example, my system with a max basic leaf of 0x16 returns 0x00000e74

You're right and the cpuid.1f.eax check is unreliable after I checked a
few machines.

+ return boot_cpu_data.x86_vendor == X86_VENDOR_INTEL && (eax != 0);

Checking 'eax != 0' is broken as it will be '0' when SMT is disabled. ecx
is the obvious choice since bits 15:8 are guaranteed to be non-zero when
the leaf is valid.

I agree with this and ecx[15:8] makes sense.

I think we can skip the vendor check. AFAIK, CPUID.1F isn't used by AMD,
and since AMD and Intel try to maintain a semblance of CPUID compatibility
it seems more likely that AMD/Hygon would implement CPUID.1F as-is rather
than repurpose it to mean something else entirely.

If it's true, let's skip the vendor check.

// I have to mention that AMD already has MCP CPUs.

#define F(x) bit(X86_FEATURE_##x)
int kvm_update_cpuid(struct kvm_vcpu *vcpu)
@@ -426,6 +436,7 @@ static inline int __do_cpuid_ent(struct
kvm_cpuid_entry2 *entry, u32 function,
switch (function) {
case 0:
entry->eax = min(entry->eax, (u32)(f_intel_pt ? 0x14 :
+ entry->eax = kvm_supported_intel_mcp() ? 0x1f : entry->eax;

If we put everything together, I think the code can be reduced to:

/* comment about multi-chip leaf... */
if (entry->eax >= 0x1f && cpuid_ecx(0x1f))
entry->eax = 0x1f;
entry->eax = min(entry->eax,
(u32)(f_intel_pt ? 0x14 : 0xd));

Based on:

ECX Bits 07 - 00: Level number. Same value in ECX input.
Bits 15 - 08: Level type.
Bits 31 - 16: Reserved.

how about using an increasing order:

entry->eax = min(entry->eax, (u32)(f_intel_pt ? 0x14 : 0xd));

// ... more checks when eax is between 0x14 and 0x1f if any

/* Check if the host cpu has multi-chip packaging technology.*/
if (((cpuid_ecx(0x1f) >> 8) & 0xff) != 0)
entry->eax = 0x1f;

As Sean pointed out, you cannot rely on the output of cpuid.1f to indicate the
existence of leaf 1f. If maximum basic leaf supported is smaller than 1f, the
data returned by cpuid_ecx(0x1f) is the actual highest basic information leaf of
the hardware.

I don't think so.

So using "entry->eax >= 0x1f" from cpuid.0H is and only is the right way to
check the existence of leaf 1f.

We can simply use (cpuid_ecx(0x1f) & 0x0000ff00) to avoid the unnecessory
shifting operation.

I borrowed this "unnecessory" shifting operation from host check_extended_topology_leaf() and we may do better on this.

Besides, the problem of simply using cpuid_exc(0x1f) in Sean's codes is that we
cannot assmue the reserved bits 31:16 of ECX is always 0 for the future

It's true cause the statement in public spec is not "Reserved = 0" but "Bits 31 - 16: Reserved".

In my opinion, Sean's codes is OK and much simple and clear.

If the host cpuid.0.eax is greater than 0x1f but actually it doesn't
have multi-chip packaging technology and we may want to expose entry->eax to some value smaller than 0x1f but greater than 0x14, much effort needs to apply on Sean's code.

My improvement is good to overwrite cpuid.0.eax in future usage
from the perspective of kvm feature setting not just from value check.

All need to do is using (cpuid_ecx(0x1f) & 0x0000ff00) to verify the leaf.1f is

// ... more checks when eax greater than 0x1f if any

are we OK with it?

case 1:
entry->edx &= kvm_cpuid_1_edx_x86_features;
@@ -544,6 +555,8 @@ static inline int __do_cpuid_ent(struct
kvm_cpuid_entry2 *entry, u32 function,
entry->edx = edx.full;
+ /* function 0x1f has additional index. */
+ case 0x1f:
/* function 0xb has additional index. */
case 0xb: {
int i, level_type;