Re: [PATCH 2/3] x86/topology: Fix AMD core count

From: Thomas Gleixner
Date: Sat Mar 19 2016 - 05:26:41 EST


On Fri, 18 Mar 2016, Peter Zijlstra wrote:
> It turns out AMD gets x86_max_cores wrong when there are compute
> units.
>
> The issue is that Linux assumes:
>
> nr_logical_cpus = nr_cores * nr_siblings
>
> But AMD reports its CU unit as 2 cores, but then sets num_smp_siblings
> to 2 as well.
>
> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> Cc: Borislav Petkov <bp@xxxxxxxxx>
> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Cc: Andreas Herrmann <aherrmann@xxxxxxxx>
> Reported-by: Xiong Zhou <jencce.kernel@xxxxxxxxx>
> Fixes: 1f12e32f4cd5 ("x86/topology: Create logical package id")
> Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> Link: http://lkml.kernel.org/r/20160317095220.GO6344@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
> ---
> arch/x86/kernel/cpu/amd.c | 8 ++++----
> arch/x86/kernel/smpboot.c | 11 ++++++-----
> 2 files changed, 10 insertions(+), 9 deletions(-)
>
> --- a/arch/x86/kernel/cpu/amd.c
> +++ b/arch/x86/kernel/cpu/amd.c
> @@ -313,9 +313,9 @@ static void amd_get_topology(struct cpui
> node_id = ecx & 7;
>
> /* get compute unit information */
> - smp_num_siblings = ((ebx >> 8) & 3) + 1;
> + cores_per_cu = smp_num_siblings = ((ebx >> 8) & 3) + 1;
> + c->x86_max_cores /= smp_num_siblings;

Unfortunately that will break stuff in event/amd/core.c, ras/mce_amd_inj.c
which rely on the AMD interpretation of c->x86_max_cores.

I'm seriously grumpy about pointless inconsistent representations depending on
the CPU vendor. That's just lazy and sloppy hackery from the "works for me"
departement.

For now we'll get away with Peters workaround for the Intel HT mess, which is
a different kind of mental insanity invented by HW/BIOS people, but this
topology stuff needs to be made consistent ASAP.

Thanks,

tglx