PROBLEM: Linux kernel 4.5 oops on boot with Xeon D-1520

From: Tony Battersby
Date: Wed Feb 17 2016 - 18:01:37 EST


The following commit in 4.5 is causing a general protection fault during
early boot:

d6980ef32570 ("perf/x86/intel/uncore: Add Broadwell-EP uncore support")

With the commit reverted, the system boots fine.

CPU: Intel(R) Xeon(R) CPU D-1520 @ 2.20GHz
Motherboard: Supermicro X10SDV-4C-TLN2F

The general protection fault occurs when
hswep_uncore_sbox_msr_init_box() calls wrmsrl(). I added a printk to
get the following values just before the oops:

box->pmu->type->box_ctl = 1824 (0x720)
box->pmu->pmu_idx = 0
box->pmu->type->msr_offset = 10 (0xa)
box->pmu->type->msr_offsets = NULL
msr = 1824 (0x720)
(values are decimal; hex equivalents in parentheses)
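
For anyone checking the arithmetic, here is a minimal user-space sketch of how the printed fields appear to combine into the MSR address. It assumes the address is box_ctl plus a per-box offset, where the offset comes from msr_offsets[pmu_idx] when that table is non-NULL and from msr_offset * pmu_idx otherwise. That formula is my reading of the uncore driver, not something verified on this machine, and the Python function names are only illustrative stand-ins for the kernel helpers.

```python
# User-space sketch only; this is not kernel code.
# Assumption: msr = box_ctl + per-box offset, as described in the lead-in.

def uncore_msr_box_offset(pmu_idx, msr_offset, msr_offsets=None):
    """Per-box offset: table lookup if a table exists, else a fixed stride."""
    if msr_offsets is not None:
        return msr_offsets[pmu_idx]
    return msr_offset * pmu_idx


def uncore_msr_box_ctl(box_ctl, pmu_idx, msr_offset, msr_offsets=None):
    """Address of the box-control MSR for the given box."""
    return box_ctl + uncore_msr_box_offset(pmu_idx, msr_offset, msr_offsets)


# Values taken from the printk output in this report (all decimal).
msr = uncore_msr_box_ctl(box_ctl=1824, pmu_idx=0, msr_offset=10,
                         msr_offsets=None)
print(f"msr = {msr} (0x{msr:x})")  # msr = 1824 (0x720)
```

With pmu_idx = 0 the offset term vanishes, so the wrmsrl() target is simply box_ctl, which matches the msr value printed above.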

Here is the call trace:
hswep_uncore_sbox_msr_init_box+0x7c/0xc0 (RIP)
uncore_cpu_starting+0x8a/0x1c0
? uncore_change_context+0xe5/0x150
? uncore_types_init+0x1d6/0x1d6
uncore_cpu_setup+0x10/0x12
on_each_cpu+0x32/0x50
intel_uncore_init+0x2e8/0x36d
? cstate_pmu_init+0x14f/0x195
? uncore_cpu_setup+0x12/0x12

I have a JPEG photo of the monitor showing the full oops; let me know
if anyone wants it.

----------

/proc/cpuinfo:

processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 86
model name : Intel(R) Xeon(R) CPU D-1520 @ 2.20GHz
stepping : 2
microcode : 0xa
cpu MHz : 2200.000
tsc MHz : 2199.998
cache size : 6144 KB
physical id : 0
siblings : 8
core id : 0
cpu cores : 4
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 20
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall
nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl
xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor
ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1
sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand
lahf_lm abm 3dnowprefetch epb intel_pt tpr_shadow vnmi flexpriority ept
vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm
rdseed adx smap xsaveopt cqm_llc cqm_occup_llc dtherm arat pln pts
bugs :
bogomips : 4399.57
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 48 bits virtual
power management:

processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 86
model name : Intel(R) Xeon(R) CPU D-1520 @ 2.20GHz
stepping : 2
microcode : 0xa
cpu MHz : 2200.000
tsc MHz : 2199.998
cache size : 6144 KB
physical id : 0
siblings : 8
core id : 1
cpu cores : 4
apicid : 2
initial apicid : 2
fpu : yes
fpu_exception : yes
cpuid level : 20
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall
nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl
xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor
ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1
sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand
lahf_lm abm 3dnowprefetch epb intel_pt tpr_shadow vnmi flexpriority ept
vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm
rdseed adx smap xsaveopt cqm_llc cqm_occup_llc dtherm arat pln pts
bugs :
bogomips : 4399.57
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 48 bits virtual
power management:

processor : 2
vendor_id : GenuineIntel
cpu family : 6
model : 86
model name : Intel(R) Xeon(R) CPU D-1520 @ 2.20GHz
stepping : 2
microcode : 0xa
cpu MHz : 2200.000
tsc MHz : 2199.998
cache size : 6144 KB
physical id : 0
siblings : 8
core id : 2
cpu cores : 4
apicid : 4
initial apicid : 4
fpu : yes
fpu_exception : yes
cpuid level : 20
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall
nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl
xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor
ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1
sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand
lahf_lm abm 3dnowprefetch epb intel_pt tpr_shadow vnmi flexpriority ept
vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm
rdseed adx smap xsaveopt cqm_llc cqm_occup_llc dtherm arat pln pts
bugs :
bogomips : 4399.57
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 48 bits virtual
power management:

processor : 3
vendor_id : GenuineIntel
cpu family : 6
model : 86
model name : Intel(R) Xeon(R) CPU D-1520 @ 2.20GHz
stepping : 2
microcode : 0xa
cpu MHz : 2200.000
tsc MHz : 2199.998
cache size : 6144 KB
physical id : 0
siblings : 8
core id : 3
cpu cores : 4
apicid : 6
initial apicid : 6
fpu : yes
fpu_exception : yes
cpuid level : 20
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall
nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl
xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor
ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1
sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand
lahf_lm abm 3dnowprefetch epb intel_pt tpr_shadow vnmi flexpriority ept
vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm
rdseed adx smap xsaveopt cqm_llc cqm_occup_llc dtherm arat pln pts
bugs :
bogomips : 4399.57
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 48 bits virtual
power management:

processor : 4
vendor_id : GenuineIntel
cpu family : 6
model : 86
model name : Intel(R) Xeon(R) CPU D-1520 @ 2.20GHz
stepping : 2
microcode : 0xa
cpu MHz : 2200.000
tsc MHz : 2199.998
cache size : 6144 KB
physical id : 0
siblings : 8
core id : 0
cpu cores : 4
apicid : 1
initial apicid : 1
fpu : yes
fpu_exception : yes
cpuid level : 20
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall
nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl
xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor
ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1
sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand
lahf_lm abm 3dnowprefetch epb intel_pt tpr_shadow vnmi flexpriority ept
vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm
rdseed adx smap xsaveopt cqm_llc cqm_occup_llc dtherm arat pln pts
bugs :
bogomips : 4399.57
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 48 bits virtual
power management:

processor : 5
vendor_id : GenuineIntel
cpu family : 6
model : 86
model name : Intel(R) Xeon(R) CPU D-1520 @ 2.20GHz
stepping : 2
microcode : 0xa
cpu MHz : 2200.000
tsc MHz : 2199.998
cache size : 6144 KB
physical id : 0
siblings : 8
core id : 1
cpu cores : 4
apicid : 3
initial apicid : 3
fpu : yes
fpu_exception : yes
cpuid level : 20
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall
nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl
xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor
ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1
sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand
lahf_lm abm 3dnowprefetch epb intel_pt tpr_shadow vnmi flexpriority ept
vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm
rdseed adx smap xsaveopt cqm_llc cqm_occup_llc dtherm arat pln pts
bugs :
bogomips : 4399.57
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 48 bits virtual
power management:

processor : 6
vendor_id : GenuineIntel
cpu family : 6
model : 86
model name : Intel(R) Xeon(R) CPU D-1520 @ 2.20GHz
stepping : 2
microcode : 0xa
cpu MHz : 2200.000
tsc MHz : 2199.998
cache size : 6144 KB
physical id : 0
siblings : 8
core id : 2
cpu cores : 4
apicid : 5
initial apicid : 5
fpu : yes
fpu_exception : yes
cpuid level : 20
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall
nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl
xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor
ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1
sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand
lahf_lm abm 3dnowprefetch epb intel_pt tpr_shadow vnmi flexpriority ept
vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm
rdseed adx smap xsaveopt cqm_llc cqm_occup_llc dtherm arat pln pts
bugs :
bogomips : 4399.57
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 48 bits virtual
power management:

processor : 7
vendor_id : GenuineIntel
cpu family : 6
model : 86
model name : Intel(R) Xeon(R) CPU D-1520 @ 2.20GHz
stepping : 2
microcode : 0xa
cpu MHz : 2200.000
tsc MHz : 2199.998
cache size : 6144 KB
physical id : 0
siblings : 8
core id : 3
cpu cores : 4
apicid : 7
initial apicid : 7
fpu : yes
fpu_exception : yes
cpuid level : 20
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall
nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl
xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor
ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1
sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand
lahf_lm abm 3dnowprefetch epb intel_pt tpr_shadow vnmi flexpriority ept
vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm
rdseed adx smap xsaveopt cqm_llc cqm_occup_llc dtherm arat pln pts
bugs :
bogomips : 4399.57
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 48 bits virtual
power management:

----------
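
The cpuinfo listing above can be condensed into the two facts that matter for this report: the CPU family/model and the core/thread layout. Below is a small sketch of one way to do that. The helper names parse_cpuinfo and threads_per_core are hypothetical, and SAMPLE is an abbreviated copy of two records from the listing (processors 0 and 4), not a complete /proc/cpuinfo.

```python
from collections import defaultdict

# Abbreviated from the listing in this report; most fields omitted.
SAMPLE = """\
processor : 0
cpu family : 6
model : 86
physical id : 0
core id : 0

processor : 4
cpu family : 6
model : 86
physical id : 0
core id : 0
"""


def parse_cpuinfo(text):
    """Split /proc/cpuinfo-style text into one dict per logical CPU."""
    cpus = []
    for block in text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:  # wrapped continuation lines without ':' are skipped
                fields[key.strip()] = value.strip()
        if "processor" in fields:
            cpus.append(fields)
    return cpus


def threads_per_core(cpus):
    """Map (physical id, core id) to the logical CPUs sharing that core."""
    cores = defaultdict(list)
    for cpu in cpus:
        key = (cpu["physical id"], cpu["core id"])
        cores[key].append(int(cpu["processor"]))
    return dict(cores)


cpus = parse_cpuinfo(SAMPLE)
family = int(cpus[0]["cpu family"])
model = int(cpus[0]["model"])
print(f"family {family}, model {model} (0x{model:x})")
print(threads_per_core(cpus))
```

To run it against a live system, replace SAMPLE with the contents of /proc/cpuinfo; on this machine that should show each of the four core ids shared by two logical CPUs.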