RE: [PATCH V6 0/6] Intel memory b/w monitoring support

From: Luck, Tony
Date: Fri Mar 11 2016 - 18:45:30 EST


> Please see if the branch below works for you:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git perf/core

Tragically, no :-( The instant I started perf stat to trace some MBM events, I got a panic.

But I think something went awry with the base version you applied these patches to. I see
a whole lot of differences between the tree you pointed to and the version I have.

-Tony

[ 320.988548] BUG: unable to handle kernel paging request at ffff888f153d3b88
[ 320.998399] IP: [<ffffffff8100afff>] update_sample+0x8f/0xf0
[ 321.006622] PGD 1f88067 PUD 0
[ 321.011910] Oops: 0000 [#1] SMP
[ 321.017328] Modules linked in: af_packet(E) iscsi_ibft(E) iscsi_boot_sysfs(E) msr(E) xfs(E) libcrc32c(E) nls_iso8859_1(E) nls_cp437(E) vfat(E) fat(E) intel_rapl(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) coretemp(E) joydev(E) dm_mod(E) ixgbe(E) kvm(E) irqbypass(E) ptp(E) crct10dif_pclmul(E) crc32_pclmul(E) iTCO_wdt(E) pps_core(E) mptctl(E) ghash_clmulni_intel(E) iTCO_vendor_support(E) mdio(E) mptbase(E) dca(E) drbg(E) ansi_cprng(E) aesni_intel(E) aes_x86_64(E) lrw(E) gf128mul(E) glue_helper(E) ablk_helper(E) cryptd(E) pcspkr(E) sb_edac(E) mei_me(E) lpc_ich(E) mei(E) mfd_core(E) edac_core(E) i2c_i801(E) wmi(E) shpchp(E) ipmi_si(E) ipmi_msghandler(E) processor(E) acpi_pad(E) button(E) efivarfs(E) btrfs(E) xor(E) raid6_pq(E) sd_mod(E) hid_generic(E) usbhid(E) sr_mod(E) cdrom(E) mgag200(E) i2c_algo_bit(E) ahci(E) drm_kms_helper(E) syscopyarea(E) libahci(E) sysfillrect(E) ehci_pci(E) ehci_hcd(E) sysimgblt(E) fb_sys_fops(E) ttm(E) crc32c_intel(E) mpt3sas(E) usbcore(E) raid_class(E) drm(E) libata(E) usb_common(E) scsi_transport_sas(E) sg(E) scsi_mod(E) autofs4(E)
[ 321.136529] CPU: 72 PID: 0 Comm: swapper/72 Tainted: G E 4.5.0-rc6-371-g520a80bcb13b #2
[ 321.148713] Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BRBDXSD1.86B.0336.V05.1603031638 03/03/2016
[ 321.162290] task: ffff881ff29994c0 ti: ffff881ff299c000 task.ti: ffff881ff299c000
[ 321.172684] RIP: 0010:[<ffffffff8100afff>] [<ffffffff8100afff>] update_sample+0x8f/0xf0
[ 321.183790] RSP: 0018:ffff887fff003f28 EFLAGS: 00010046
[ 321.191794] RAX: 0000000000000000 RBX: ffff888f153d3b80 RCX: 0000000000000000
[ 321.201872] RDX: 0000000000000000 RSI: ffff887fff003f2c RDI: 0000000000000c8e
[ 321.211928] RBP: ffff887fff003f40 R08: 000000000000001c R09: 0000000000004a23
[ 321.221990] R10: 00000000000000d8 R11: 0000000000000005 R12: 0000000000000000
[ 321.232047] R13: ffffffff8100b0b0 R14: ffff883ff2533d60 R15: 0000004a5791b2c7
[ 321.242113] FS: 0000000000000000(0000) GS:ffff887fff000000(0000) knlGS:0000000000000000
[ 321.253262] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 321.261778] CR2: ffff888f153d3b88 CR3: 0000000001a0a000 CR4: 00000000003406e0
[ 321.271856] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 321.281937] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 321.292012] Stack:
[ 321.296348] 00000000810e0d58 ffff883ff2533d60 0000000000000000 ffff887fff003f58
[ 321.306804] ffffffff8100b0c9 ffffe8ffff8037c0 ffff887fff003f88 ffffffff810e562c
[ 321.317281] ffffe8ffff80f500 0000000000000002 0000000000000048 ffffffff81ad0a20
[ 321.327761] Call Trace:
[ 321.332639] <IRQ>
[ 321.334805] [<ffffffff8100b0c9>] __intel_mbm_event_count+0x19/0x30
[ 321.346244] [<ffffffff810e562c>] flush_smp_call_function_queue+0x4c/0x130
[ 321.356030] [<ffffffff810e6053>] generic_smp_call_function_single_interrupt+0x13/0x60
[ 321.366978] [<ffffffff8103ac07>] smp_call_function_interrupt+0x27/0x40
[ 321.376469] [<ffffffff815bd132>] call_function_interrupt+0x82/0x90
[ 321.385568] <EOI>
[ 321.387735] [<ffffffff81486fc5>] ? cpuidle_enter_state+0xd5/0x250
[ 321.399053] [<ffffffff81486fa1>] ? cpuidle_enter_state+0xb1/0x250
[ 321.408053] [<ffffffff81487177>] cpuidle_enter+0x17/0x20
[ 321.416184] [<ffffffff810aa68d>] cpu_startup_entry+0x25d/0x350
[ 321.424901] [<ffffffff8103b793>] start_secondary+0x113/0x140
[ 321.433439] Code: 04 00 66 90 bf 8e 0c 00 00 48 8d 75 ec e8 ea 35 04 00 66 90 48 ba 00 00 00 00 00 00 00 c0 48 85 d0 75 48 45 85 e4 75 2d 48 89 c2 <48> 2b 53 08 8b 0d 5b 93 d2 00 48 89 43 08 81 e2 ff ff ff 00 48
[ 321.459674] RIP [<ffffffff8100afff>] update_sample+0x8f/0xf0
[ 321.468328] RSP <ffff887fff003f28>
[ 321.474436] CR2: ffff888f153d3b88
[ 321.490193] ---[ end trace f73c5e7e5070d07b ]---
[ 321.490200] BUG: unable to handle kernel paging request at ffff888f153d2388
[ 321.490214] IP: [<ffffffff8100afff>] update_sample+0x8f/0xf0
[ 321.490216] PGD 1f88067 PUD 0
[ 321.490219] Oops: 0000 [#2] SMP
[ 321.490265] Modules linked in: af_packet(E) iscsi_ibft(E) iscsi_boot_sysfs(E) msr(E) xfs(E) libcrc32c(E) nls_iso8859_1(E) nls_cp437(E) vfat(E) fat(E) intel_rapl(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) coretemp(E) joydev(E) dm_mod(E) ixgbe(E) kvm(E) irqbypass(E) ptp(E) crct10dif_pclmul(E) crc32_pclmul(E) iTCO_wdt(E) pps_core(E) mptctl(E) ghash_clmulni_intel(E) iTCO_vendor_support(E) mdio(E) mptbase(E) dca(E) drbg(E) ansi_cprng(E) aesni_intel(E) aes_x86_64(E) lrw(E) gf128mul(E) glue_helper(E) ablk_helper(E) cryptd(E) pcspkr(E) sb_edac(E) mei_me(E) lpc_ich(E) mei(E) mfd_core(E) edac_core(E) i2c_i801(E) wmi(E) shpchp(E) ipmi_si(E) ipmi_msghandler(E) processor(E) acpi_pad(E) button(E) efivarfs(E) btrfs(E) xor(E) raid6_pq(E) sd_mod(E) hid_generic(E) usbhid(E) sr_mod(E) cdrom(E) mgag200(E) i2c_algo_bit(E) ahci(E) drm_kms_helper(E) syscopyarea(E) libahci(E) sysfillrect(E) ehci_pci(E) ehci_hcd(E) sysimgblt(E) fb_sys_fops(E) ttm(E) crc32c_intel(E) mpt3sas(E) usbcore(E) raid_class(E) drm(E) libata(E) usb_common(E) scsi_transport_sas(E) sg(E) scsi_mod(E) autofs4(E)
[ 321.490282] CPU: 24 PID: 0 Comm: swapper/24 Tainted: G D E 4.5.0-rc6-371-g520a80bcb13b #2
[ 321.490283] Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BRBDXSD1.86B.0336.V05.1603031638 03/03/2016
[ 321.490285] task: ffff881ff280c8c0 ti: ffff881ff2810000 task.ti: ffff881ff2810000
[ 321.490290] RIP: 0010:[<ffffffff8100afff>] [<ffffffff8100afff>] update_sample+0x8f/0xf0
[ 321.490292] RSP: 0018:ffff883fff803f28 EFLAGS: 00010046
[ 321.490293] RAX: 0000000000000000 RBX: ffff888f153d2380 RCX: 0000000000000000
[ 321.490294] RDX: 0000000000000000 RSI: ffff883fff803f2c RDI: 0000000000000c8e
[ 321.490295] RBP: ffff883fff803f40 R08: 000000000000001c R09: 0000000000004a2e
[ 321.490296] R10: 00000000000000f4 R11: 0000000000000000 R12: 0000000000000000
[ 321.490297] R13: ffffffff8100b0b0 R14: ffff883ff2533d60 R15: 0000004a5791a454
[ 321.490299] FS: 0000000000000000(0000) GS:ffff883fff800000(0000) knlGS:0000000000000000
[ 321.490300] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 321.490301] CR2: ffff888f153d2388 CR3: 0000000001a0a000 CR4: 00000000003406e0
[ 321.490302] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 321.490303] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 321.490304] Stack:
[ 321.490307] 00000000810e0d58 ffff883ff2533d60 0000000000000000 ffff883fff803f58
[ 321.490309] ffffffff8100b0c9 ffffe8c0000037c0 ffff883fff803f88 ffffffff810e562c
[ 321.490311] ffffe8c00000f500 0000000000000002 0000000000000018 ffffffff81ad0a20
[ 321.490311] Call Trace:
[ 321.490319] <IRQ>
[ 321.490319] [<ffffffff8100b0c9>] __intel_mbm_event_count+0x19/0x30
[ 321.490327] [<ffffffff810e562c>] flush_smp_call_function_queue+0x4c/0x130
[ 321.490330] [<ffffffff810e6053>] generic_smp_call_function_single_interrupt+0x13/0x60
[ 321.490337] [<ffffffff8103ac07>] smp_call_function_interrupt+0x27/0x40
[ 321.490347] [<ffffffff815bd132>] call_function_interrupt+0x82/0x90
[ 321.490356] <EOI>
[ 321.490356] [<ffffffff81486fc5>] ? cpuidle_enter_state+0xd5/0x250
[ 321.490359] [<ffffffff81486fa1>] ? cpuidle_enter_state+0xb1/0x250
[ 321.490362] [<ffffffff81487177>] cpuidle_enter+0x17/0x20
[ 321.490369] [<ffffffff810aa68d>] cpu_startup_entry+0x25d/0x350
[ 321.490372] [<ffffffff8103b793>] start_secondary+0x113/0x140
[ 321.490399] Code: 04 00 66 90 bf 8e 0c 00 00 48 8d 75 ec e8 ea 35 04 00 66 90 48 ba 00 00 00 00 00 00 00 c0 48 85 d0 75 48 45 85 e4 75 2d 48 89 c2 <48> 2b 53 08 8b 0d 5b 93 d2 00 48 89 43 08 81 e2 ff ff ff 00 48
[ 321.490402] RIP [<ffffffff8100afff>] update_sample+0x8f/0xf0
[ 321.490403] RSP <ffff883fff803f28>
[ 321.490404] CR2: ffff888f153d2388
[ 321.490409] ---[ end trace f73c5e7e5070d07c ]---
[ 321.490417] BUG: unable to handle kernel paging request at ffff888f153d1788
[ 321.491691] IP: [<ffffffff8100afff>] update_sample+0x8f/0xf0
[ 321.491695] PGD 1f88067 PUD 0
[ 321.491698] Oops: 0000 [#3] SMP
[ 321.495324] Modules linked in: af_packet(E) iscsi_ibft(E) iscsi_boot_sysfs(E) msr(E) xfs(E) libcrc32c(E) nls_iso8859_1(E) nls_cp437(E) vfat(E) fat(E) intel_rapl(E) x86_pkg_temp_thermal(E) intel_powerclamp(E)
[ 321.495324] Kernel panic - not syncing: Fatal exception in interrupt
[ 321.495374] coretemp(E) joydev(E) dm_mod(E) ixgbe(E) kvm(E) irqbypass(E) ptp(E) crct10dif_pclmul(E) crc32_pclmul(E) iTCO_wdt(E) pps_core(E) mptctl(E) ghash_clmulni_intel(E) iTCO_vendor_support(E) mdio(E) mptbase(E) dca(E) drbg(E) ansi_cprng(E) aesni_intel(E) aes_x86_64(E) lrw(E) gf128mul(E) glue_helper(E) ablk_helper(E) cryptd(E) pcspkr(E) sb_edac(E) mei_me(E) lpc_ich(E) mei(E) mfd_core(E) edac_core(E) i2c_i801(E) wmi(E) shpchp(E) ipmi_si(E) ipmi_msghandler(E) processor(E) acpi_pad(E) button(E) efivarfs(E) btrfs(E) xor(E) raid6_pq(E) sd_mod(E) hid_generic(E) usbhid(E) sr_mod(E) cdrom(E) mgag200(E) i2c_algo_bit(E) ahci(E) drm_kms_helper(E) syscopyarea(E) libahci(E) sysfillrect(E) ehci_pci(E) ehci_hcd(E) sysimgblt(E) fb_sys_fops(E) ttm(E) crc32c_intel(E) mpt3sas(E) usbcore(E) raid_class(E) drm(E) libata(E) usb_common(E) scsi_transport_sas(E) sg(E) scsi_mod(E) autofs4(E)
[ 321.495383] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G D E 4.5.0-rc6-371-g520a80bcb13b #2
[ 321.495385] Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BRBDXSD1.86B.0336.V05.1603031638 03/03/2016
[ 321.495386] task: ffffffff81a0f4c0 ti: ffffffff81a00000 task.ti: ffffffff81a00000
[ 321.495392] RIP: 0010:[<ffffffff8100afff>] [<ffffffff8100afff>] update_sample+0x8f/0xf0
[ 321.495393] RSP: 0018:ffff881fff803f28 EFLAGS: 00010046
[ 321.495395] RAX: 0000000000013c38 RBX: ffff888f153d1780 RCX: 0000000000000000
[ 321.495396] RDX: 0000000000013c38 RSI: ffff881fff803f2c RDI: 0000000000000c8e
[ 321.495397] RBP: ffff881fff803f40 R08: 0000000000000018 R09: 0000000000069083
[ 321.495398] R10: 0000000000004aa6 R11: 0000000000000000 R12: 0000000000000000
[ 321.495399] R13: ffffffff8100b0b0 R14: ffff883ff2533d60 R15: 0000004a57919508
[ 321.495401] FS: 0000000000000000(0000) GS:ffff881fff800000(0000) knlGS:0000000000000000
[ 321.495402] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 321.495403] CR2: ffff888f153d1788 CR3: 0000000001a0a000 CR4: 00000000003406f0
[ 321.495404] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 321.495405] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 321.495406] Stack:
[ 321.495409] 00000000810e0d58 ffff883ff2533d60 0000000000000000 ffff881fff803f58
[ 321.495411] ffffffff8100b0c9 ffffe8a0000037c0 ffff881fff803f88 ffffffff810e562c
[ 321.495413] ffffe8a00000f500 0000000000000004 0000000000000000 ffffffff81ad0a20
[ 321.495414] Call Trace:
[ 321.495421] <IRQ>
[ 321.495421] [<ffffffff8100b0c9>] __intel_mbm_event_count+0x19/0x30
[ 321.495426] [<ffffffff810e562c>] flush_smp_call_function_queue+0x4c/0x130
[ 321.495429] [<ffffffff810e6053>] generic_smp_call_function_single_interrupt+0x13/0x60
[ 321.495434] [<ffffffff8103ac07>] smp_call_function_interrupt+0x27/0x40
[ 321.495440] [<ffffffff815bd132>] call_function_interrupt+0x82/0x90
[ 321.495447] <EOI>
[ 321.495448] [<ffffffff81486fc5>] ? cpuidle_enter_state+0xd5/0x250
[ 321.495450] [<ffffffff81486fa1>] ? cpuidle_enter_state+0xb1/0x250
[ 321.495453] [<ffffffff81487177>] cpuidle_enter+0x17/0x20
[ 321.495457] [<ffffffff810aa68d>] cpu_startup_entry+0x25d/0x350
[ 321.495463] [<ffffffff815b03dc>] rest_init+0x7c/0x80
[ 321.495472] [<ffffffff81b5d0be>] start_kernel+0x486/0x493
[ 321.495475] [<ffffffff81b5ca26>] ? set_init_arg+0x55/0x55
[ 321.495479] [<ffffffff81b5c120>] ? early_idt_handler_array+0x120/0x120
[ 321.495482] [<ffffffff81b5c5ca>] x86_64_start_reservations+0x2a/0x2c
[ 321.495485] [<ffffffff81b5c709>] x86_64_start_kernel+0x13d/0x14c
[ 321.495511] Code: 04 00 66 90 bf 8e 0c 00 00 48 8d 75 ec e8 ea 35 04 00 66 90 48 ba 00 00 00 00 00 00 00 c0 48 85 d0 75 48 45 85 e4 75 2d 48 89 c2 <48> 2b 53 08 8b 0d 5b 93 d2 00 48 89 43 08 81 e2 ff ff ff 00 48
[ 321.495515] RIP [<ffffffff8100afff>] update_sample+0x8f/0xf0
[ 321.495516] RSP <ffff881fff803f28>
[ 321.495517] CR2: ffff888f153d1788
[ 321.495520] ---[ end trace f73c5e7e5070d07d ]---
[ 322.567367] Shutting down cpus with NMI
[ 322.579083] Kernel Offset: disabled
[ 322.603886] ---[ end Kernel panic - not syncing: Fatal exception in interrupt