Re: [PATCH v4 2/2] x86/amd: Fixup cpu_core_id for family17h downcore configuration

From: Suravee Suthikulpanit
Date: Tue Jul 25 2017 - 01:52:13 EST


Boris,

On 7/24/17 21:44, Borislav Petkov wrote:
On Mon, Jul 24, 2017 at 09:14:18PM +0700, Suravee Suthikulpanit wrote:
Actually, this is not totally accurate. My apology. This patch is
mainly fix to incorrect core ID in /proc/cpuinfo.

So you're "fixing" only some numbering thing. Because core_id doesn't
have any influence on anything. Here's on an Intel box I have here:

processor : 0 physical id : 0 core id : 0
processor : 1 physical id : 1 core id : 0
processor : 2 physical id : 2 core id : 0
processor : 3 physical id : 3 core id : 0
processor : 4 physical id : 0 core id : 8
processor : 5 physical id : 1 core id : 8
processor : 6 physical id : 2 core id : 8
processor : 7 physical id : 3 core id : 8
processor : 8 physical id : 0 core id : 2
processor : 9 physical id : 1 core id : 2
processor : 10 physical id : 2 core id : 2
processor : 11 physical id : 3 core id : 2
processor : 12 physical id : 0 core id : 10
processor : 13 physical id : 1 core id : 10
processor : 14 physical id : 2 core id : 10
processor : 15 physical id : 3 core id : 10

[....]

So those core id numbers can be anything as long as the cpumasks used by
the scheduler are correct.

Ok. Sure, it doesn't need be contiguous. But at least the cpu_core_id should represent an ID that make some sense since it is used in the arch/x86/kernel/smpboot.c: match_smt() and some other places. So, if it's invalid for the downcore configuration (i.e. duplicated where it should not be), we should at least clean this up.

This is due to the cpu_core_id fixup in amd_get_topology() below:

/* fixup multi-node processor information */
if (nodes_per_socket > 1) {
u32 cus_per_node;

set_cpu_cap(c, X86_FEATURE_AMD_DCM);
cus_per_node = c->x86_max_cores / nodes_per_socket;

/* core id has to be in the [0 .. cores_per_node - 1] range */
c->cpu_core_id %= cus_per_node;
}

AFAICT, Andreas did this for MC at the time:

4a376ec3a259 ("x86: Fix CPU llc_shared_map information for AMD Magny-Cours")

but I don't think we need to care about core_ids fitting into the node
range anymore. For the above reason - topology doesn't use core ids.

Agree to the point that it does not need to be fitting into the node range.

So you can just as well let ->cpu_core_id be derived from the
->initial_apicid as it is being done now in amd_detect_cmp().

Actually, for family17h, this is from the CPUID_Fn8000001E_EBX[CoreId]. But I get your point.

In order not to cause any more confusion, you can limit the above fixup
to anything below F17h so that we don't upset existing users and add a
big fat comment as to why we're doing this. But if it is only a silly
numbering thing, I don't see the need for doing that jumping through
hoops.


I will update the patch to only limit the fixup to pre-family17h.

Thanks,
Suravee