Re: 4.5.0+ panic when setup loop device

From: Peter Zijlstra
Date: Thu Mar 17 2016 - 05:52:50 EST


On Thu, Mar 17, 2016 at 09:56:05AM +0800, Xiong Zhou wrote:
> On Wed, Mar 16, 2016 at 11:26 PM, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:

> > Can you please provide a full boot log and the output of 'cat /proc/cpuinfo' ?

Mar 17 17:34:30 myhost kernel: smpboot: Max logical packages: 1
Mar 17 17:34:30 myhost kernel: smpboot: APIC(20) Converting physical 1 to logical package 0
Mar 17 17:34:30 myhost kernel: smpboot: APIC(40) Package 2 exceeds logical package map

So that is busted. It turns out the AMD code gets x86_max_cores wrong
when there are compute units, so we end up with a single logical package
and reject the second one.

Mar 17 17:34:30 myhost kernel: smpboot: CPU 1 APICId 40 disabled
Mar 17 17:34:30 myhost kernel: Switched APIC routing to physical flat.
Mar 17 17:34:30 myhost kernel: ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
Mar 17 17:34:30 myhost kernel: smpboot: CPU0: AMD Opteron(TM) Processor 6274 (family: 0x15, model: 0x1, stepping: 0x2)
Mar 17 17:34:30 myhost kernel: Performance Events: Fam15h core perfctr, Broken BIOS detected, complain to your hardware vendor.
Mar 17 17:34:30 myhost kernel: [Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR c0010200 is 430076)
Mar 17 17:34:30 myhost kernel: AMD PMU driver.
Mar 17 17:34:30 myhost kernel: ... version: 0
Mar 17 17:34:30 myhost kernel: ... bit width: 48
Mar 17 17:34:30 myhost kernel: ... generic registers: 6
Mar 17 17:34:30 myhost kernel: ... value mask: 0000ffffffffffff
Mar 17 17:34:30 myhost kernel: ... max period: 00007fffffffffff
Mar 17 17:34:30 myhost kernel: ... fixed-purpose events: 0
Mar 17 17:34:30 myhost kernel: ... event mask: 000000000000003f
Mar 17 17:34:30 myhost kernel: NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
Mar 17 17:34:30 myhost kernel: .... node #0, CPUs: #2 #3 #4 #5 #6 #7 #8 #9 #10 #11 #12 #13 #14 #15 #16
Mar 17 17:34:30 myhost kernel: .... node #3, CPUs: #17
Mar 17 17:34:30 myhost kernel: .... node #0, CPUs: #18
Mar 17 17:34:30 myhost kernel: .... node #3, CPUs: #19
Mar 17 17:34:30 myhost kernel: .... node #0, CPUs: #20
Mar 17 17:34:30 myhost kernel: .... node #3, CPUs: #21
Mar 17 17:34:30 myhost kernel: .... node #0, CPUs: #22
Mar 17 17:34:30 myhost kernel: .... node #3, CPUs: #23
Mar 17 17:34:30 myhost kernel: .... node #0, CPUs: #24
Mar 17 17:34:30 myhost kernel: .... node #3, CPUs: #25
Mar 17 17:34:30 myhost kernel: .... node #0, CPUs: #26
Mar 17 17:34:30 myhost kernel: .... node #3, CPUs: #27
Mar 17 17:34:30 myhost kernel: .... node #0, CPUs: #28
Mar 17 17:34:30 myhost kernel: .... node #3, CPUs: #29
Mar 17 17:34:30 myhost kernel: .... node #0, CPUs: #30
Mar 17 17:34:30 myhost kernel: .... node #3, CPUs: #31
Mar 17 17:34:30 myhost kernel: x86: Booted up 2 nodes, 31 CPUs

And that is one weird node mapping..


I have a similar system, which after the below patch says:

[ 0.182174] max_cores: 8, cpu_ids: 32, num_siblings: 2, coreid_bits: 5
[ 0.188712] smpboot: Max logical packages: 2
[ 0.192988] smpboot: APIC(20) Converting physical 1 to logical package 0
[ 0.199689] smpboot: APIC(40) Converting physical 2 to logical package 1
[ 0.206405] Switched APIC routing to physical flat.
[ 0.211851] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
[ 0.329578] smpboot: CPU0: AMD Opteron(tm) Processor 6278 (family: 0x15, model: 0x1, stepping: 0x2)
[ 0.338705] Performance Events: Fam15h core perfctr, AMD PMU driver.
[ 0.345134] ... version: 0
[ 0.349147] ... bit width: 48
[ 0.353262] ... generic registers: 6
[ 0.357274] ... value mask: 0000ffffffffffff
[ 0.362586] ... max period: 00007fffffffffff
[ 0.367900] ... fixed-purpose events: 0
[ 0.371911] ... event mask: 000000000000003f
[ 0.378664] MCE: In-kernel MCE decoding enabled.
[ 0.383965] NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
[ 0.393079] x86: Booting SMP configuration:
[ 0.397262] .... node #0, CPUs: #1 #2 #3 #4 #5 #6 #7
[ 0.848764] .... node #1, CPUs: #8 #9 #10 #11 #12 #13 #14 #15
[ 1.364701] .... node #2, CPUs: #16 #17 #18 #19 #20 #21 #22 #23
[ 1.898586] .... node #3, CPUs: #24 #25 #26 #27 #28 #29 #30 #31
[ 2.413417] x86: Booted up 4 nodes, 32 CPUs
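
For reference, the physical package ids in the log above are consistent
with shifting the APIC id right by coreid_bits, assuming the APIC(..)
values are printed in hex. A minimal sketch of that arithmetic follows;
phys_pkg_id() is a hypothetical helper name, not a kernel function:

```c
#include <assert.h>

/*
 * Hypothetical helper (not a kernel function): derive the physical
 * package id by dropping the low coreid_bits of the APIC id.
 */
static unsigned int phys_pkg_id(unsigned int apicid, unsigned int coreid_bits)
{
	assert(coreid_bits < 32);	/* keep the shift well defined */
	return apicid >> coreid_bits;
}
```

With coreid_bits = 5, phys_pkg_id(0x20, 5) gives 1 and
phys_pkg_id(0x40, 5) gives 2, matching "physical 1" and "physical 2" in
the log.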

Could you please try it? I'm not sure how this would explain your loop
device failure, but the boot log certainly points at something being
broken.


Andreas; Borislav said to Cc you since you wrote all this.
The issue is that Linux assumes:

nr_logical_cpus = nr_cores * nr_siblings

But AMD counts each compute unit (CU) as 2 cores in x86_max_cores, and
then sets smp_num_siblings to 2 as well, so the product counts every CPU
twice.
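
To make the arithmetic concrete, here is a minimal user-space sketch of
the package estimate under that assumption. The helper names are
hypothetical, and the rounded-up division is assumed to mirror what
smp_init_package_map() does. The pre-patch x86_max_cores value of 16 is
inferred from the patch below halving it to 8; it is not taken from a
log.

```c
#include <assert.h>

/* Round-up integer division, in the spirit of the kernel's DIV_ROUND_UP(). */
static unsigned int div_round_up(unsigned int n, unsigned int d)
{
	assert(d > 0);
	return (n + d - 1) / d;
}

/*
 * Hypothetical helper: estimate the number of packages, assuming the
 * logical CPUs per package equal max_cores * num_siblings.
 */
static unsigned int max_logical_packages(unsigned int cpu_ids,
					 unsigned int max_cores,
					 unsigned int num_siblings)
{
	return div_round_up(cpu_ids, max_cores * num_siblings);
}
```

With cpu_ids = 32 and num_siblings = 2, max_cores = 16 yields 1 package
(the broken case), while max_cores = 8 yields 2 packages, matching the
"Max logical packages: 2" line from the patched boot.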

Thomas; I removed the first branch, the one testing pkg against
__max_logical_packages. If the first pkg id is not below
__max_logical_packages, find_first_zero_bit() hands it logical package
id 0. If the second pkg id is then 0, that branch claims logical package
0 again without testing whether it was already taken. The branch also
fails to print the mapping.
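
The collision can be reproduced with a small user-space model. This is
only a sketch: struct pkg_map, pkg_map_init() and pkg_map_update() are
hypothetical names, bool arrays stand in for the kernel bitmaps, and the
update function returns the logical id rather than an error code.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_PHYS_PKGS 8	/* arbitrary bound for this model */

struct pkg_map {
	unsigned int max_logical;
	bool phys_seen[MAX_PHYS_PKGS];
	bool logical_used[MAX_PHYS_PKGS];
	int phys_to_logical[MAX_PHYS_PKGS];
};

static void pkg_map_init(struct pkg_map *m, unsigned int max_logical)
{
	unsigned int i;

	assert(max_logical <= MAX_PHYS_PKGS);
	m->max_logical = max_logical;
	for (i = 0; i < MAX_PHYS_PKGS; i++) {
		m->phys_seen[i] = false;
		m->logical_used[i] = false;
		m->phys_to_logical[i] = -1;
	}
}

/* Returns the logical package id for pkg, or -1 if none is left. */
static int pkg_map_update(struct pkg_map *m, unsigned int pkg,
			  bool identity_branch)
{
	unsigned int new;

	assert(pkg < MAX_PHYS_PKGS);
	if (m->phys_seen[pkg])
		return m->phys_to_logical[pkg];
	m->phys_seen[pkg] = true;

	/* Models the removed branch: claims the slot without testing it. */
	if (identity_branch && pkg < m->max_logical) {
		m->logical_used[pkg] = true;
		m->phys_to_logical[pkg] = (int)pkg;
		return (int)pkg;
	}

	/* Models find_first_zero_bit() over the logical map. */
	for (new = 0; new < m->max_logical; new++)
		if (!m->logical_used[new])
			break;
	if (new >= m->max_logical)
		return -1;	/* phys_to_logical[pkg] stays -1 */

	m->logical_used[new] = true;
	m->phys_to_logical[pkg] = (int)new;
	return (int)new;
}
```

With max_logical = 2 and physical ids arriving as 2 then 0, the model
with the branch enabled maps both to logical package 0. With the branch
disabled, they map to 0 and 1.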


---
arch/x86/kernel/cpu/amd.c | 8 ++++----
arch/x86/kernel/smpboot.c | 11 ++++++-----
2 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 97c59fd..6216e80 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -310,9 +310,9 @@ static void amd_get_topology(struct cpuinfo_x86 *c)
 		node_id = ecx & 7;

 		/* get compute unit information */
-		smp_num_siblings = ((ebx >> 8) & 3) + 1;
+		cores_per_cu = smp_num_siblings = ((ebx >> 8) & 3) + 1;
+		c->x86_max_cores /= smp_num_siblings;
 		c->compute_unit_id = ebx & 0xff;
-		cores_per_cu += ((ebx >> 8) & 3);
 	} else if (cpu_has(c, X86_FEATURE_NODEID_MSR)) {
 		u64 value;

@@ -328,8 +328,8 @@ static void amd_get_topology(struct cpuinfo_x86 *c)
 		u32 cus_per_node;

 		set_cpu_cap(c, X86_FEATURE_AMD_DCM);
-		cores_per_node = c->x86_max_cores / nodes_per_socket;
-		cus_per_node = cores_per_node / cores_per_cu;
+		cus_per_node = c->x86_max_cores / nodes_per_socket;
+		cores_per_node = cus_per_node * cores_per_cu;

 		/* store NodeID, use llc_shared_map to store sibling info */
 		per_cpu(cpu_llc_id, cpu) = node_id;
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 643dbdc..15c5fda 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -274,11 +274,6 @@ int topology_update_package_map(unsigned int apicid, unsigned int cpu)
 	if (test_and_set_bit(pkg, physical_package_map))
 		goto found;

-	if (pkg < __max_logical_packages) {
-		set_bit(pkg, logical_package_map);
-		physical_to_logical_pkg[pkg] = pkg;
-		goto found;
-	}
 	new = find_first_zero_bit(logical_package_map, __max_logical_packages);
 	if (new >= __max_logical_packages) {
 		physical_to_logical_pkg[pkg] = -1;
@@ -314,6 +309,12 @@ static void __init smp_init_package_map(void)
 	unsigned int ncpus, cpu;
 	size_t size;

+	printk("max_cores: %d, cpu_ids: %d, num_siblings: %d, coreid_bits: %d\n",
+			boot_cpu_data.x86_max_cores,
+			nr_cpu_ids,
+			smp_num_siblings,
+			boot_cpu_data.x86_coreid_bits);
+
 	/*
 	 * Today neither Intel nor AMD support heterogenous systems. That
 	 * might change in the future....