x86 PMU broken in current Linus' tree

From: Jiri Kosina
Date: Tue Aug 02 2016 - 11:41:25 EST


With current Linus' tree (HEAD == 731c7d3a20), I am getting a bogus MSR
write warning during bootup, and a kernel panic when shutting PMUs down
during poweroff.

The MSR warning is below; a camera capture of the poweroff panic can be
found at

http://www.jikos.cz/jikos/junk/pmu-panic.jpg

The last kernel version that I booted on this particular machine was
4.7.0-rc4, and it showed neither of those symptoms, so I can bisect if
needed.

=== [ snip ] ===
[ 0.136000] smpboot: CPU0: Intel(R) Core(TM)2 Duo CPU L9400 @ 1.86GHz (family: 0x6, model: 0x17, stepping: 0x6)
[ 0.136000] Performance Events: PEBS fmt0+, Core2 events, Intel PMU driver.
[ 0.136000] ... version: 2
[ 0.136000] ... bit width: 40
[ 0.136000] ... generic registers: 2
[ 0.136000] ... value mask: 000000ffffffffff
[ 0.136000] ... max period: 000000007fffffff
[ 0.136000] ... fixed-purpose events: 3
[ 0.136000] ... event mask: 0000000700000003
[ 0.136000] NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
[ 0.136000] unchecked MSR access error: WRMSR to 0xdf (tried to write 0x000000ff80000001) at rIP: 0xffffffff90004acc (x86_perf_event_set_period+0xdc/0x190)
[ 0.136000] ffff8e39f941b000 ffff8e39fbc0a440 000000000000001e ffff8e39fbc0a660
[ 0.136000] ffff8e39f9523b78 ffffffff90004bd0 ffff8e39f941b000 ffff8e39fbc0a760
[ 0.136000] ffff8e39fbc0a440 ffff8e39f9523bc0 ffffffff900053ef 00000000f951c2c0
[ 0.136000] Call Trace:
[ 0.136000] [<ffffffff90004bd0>] x86_pmu_start+0x50/0x110
[ 0.136000] [<ffffffff900053ef>] x86_pmu_enable+0x27f/0x2f0
[ 0.136000] [<ffffffff90175642>] perf_pmu_enable+0x22/0x30
[ 0.136000] [<ffffffff901766b1>] ctx_resched+0x51/0x60
[ 0.136000] [<ffffffff901768a0>] __perf_event_enable+0x1e0/0x240
[ 0.136000] [<ffffffff9016e5f9>] event_function+0xa9/0x180
[ 0.136000] [<ffffffff901766c0>] ? ctx_resched+0x60/0x60
[ 0.136000] [<ffffffff9016fcef>] remote_function+0x3f/0x50
[ 0.136000] [<ffffffff901012b6>] generic_exec_single+0xb6/0x140
[ 0.136000] [<ffffffff9016fcb0>] ? perf_cgroup_attach+0x50/0x50
[ 0.136000] [<ffffffff9016fcb0>] ? perf_cgroup_attach+0x50/0x50
[ 0.136000] [<ffffffff901013f7>] smp_call_function_single+0xb7/0x110
[ 0.136000] [<ffffffff9016e984>] cpu_function_call+0x34/0x40
[ 0.136000] [<ffffffff9016e550>] ? list_del_event+0x150/0x150
[ 0.136000] [<ffffffff9016ecda>] event_function_call+0x11a/0x120
[ 0.136000] [<ffffffff901766c0>] ? ctx_resched+0x60/0x60
[ 0.136000] [<ffffffff9016ed79>] _perf_event_enable+0x49/0x70
[ 0.136000] [<ffffffff901736ac>] perf_event_enable+0x1c/0x40
[ 0.136000] [<ffffffff9013cad2>] watchdog_enable+0x132/0x1d0
[ 0.136000] [<ffffffff90092440>] smpboot_thread_fn+0xe0/0x1d0
[ 0.136000] [<ffffffff90092360>] ? sort_range+0x30/0x30
[ 0.136000] [<ffffffff9008e7e2>] kthread+0xf2/0x110
[ 0.136000] [<ffffffff9069e611>] ? wait_for_completion+0xe1/0x110
[ 0.136000] [<ffffffff906a3b2f>] ret_from_fork+0x1f/0x40
[ 0.136000] [<ffffffff9008e6f0>] ? kthread_create_on_node+0x220/0x220
=== [ snip ] ===
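
For reference, the value in the WRMSR warning can be cross-checked against
the "value mask" and "max period" lines of the same log. The sketch below is
only arithmetic on the logged numbers, not a diagnosis; it assumes the usual
convention of programming a counter with the negated period truncated to the
counter width, and the helper name counter_start_value is made up for
illustration.

```python
# Values copied verbatim from the boot log above.
VALUE_MASK = 0x000000ffffffffff  # "... value mask" (40-bit counter)
MAX_PERIOD = 0x000000007fffffff  # "... max period"
WRITTEN = 0x000000ff80000001     # value reported in the WRMSR warning


def counter_start_value(period, mask):
    """Hypothetical helper: negate the period and truncate it to the
    counter width, so the counter overflows after `period` events.
    This convention is an assumption, not taken from the report."""
    return (-period) & mask


if __name__ == "__main__":
    computed = counter_start_value(MAX_PERIOD, VALUE_MASK)
    print(hex(computed), computed == WRITTEN)
```

Under that assumption the written value equals the negated max period
masked to 40 bits.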

--
Jiri Kosina
SUSE Labs