RE: perf: fuzzer triggered warning in intel_pmu_drain_pebs_nhm()

From: Vince Weaver
Date: Mon Jul 06 2015 - 12:45:31 EST


On Mon, 6 Jul 2015, Liang, Kan wrote:

>
> > On Mon, 6 Jul 2015, Liang, Kan wrote:
> >
> > >
> > > > On Fri, Jul 03, 2015 at 08:08:27PM +0000, Liang, Kan wrote:
> > > > > If we cleared the last bit, we not only drain the buffer but also
> > > > > decrease the event->ctx->pmu, which is used to flush the PEBS
> > > > > buffer during context switches.
> > > > > We need to disable cpuc->pebs_enabled before changing
> > > > > event->ctx->pmu as below.
> > > > >
> > > >
> > > > Indeed, mind sending a proper patch so I can press 'A' on it?
> > >
> > > Sure, I will do that.
> > > But I didn't verify the patch, since I cannot reproduce the issue.
> > >
> > > Vince, would you mind testing the patch?
> > > If the issue is gone, I will send a proper patch then.
> >
> > I've got too many patches floating around so I forget what this one is trying
> > to fix. The pebs related lockup? Or the warning?
> >
>
> It's trying to fix the warning issue as below.
> For the test result, it looks like the patch doesn't help, does it?
>
> > > WARN_ON_ONCE(!event->attr.precise_ip);
> > >
> > > [ 584.352324] WARNING: CPU: 2 PID: 18924 at
> > > arch/x86/kernel/cpu/perf_event_intel_ds.c:1198
> > > intel_pmu_drain_pebs_nhm+0x283/0x2e0()
>
>
> > Was this patch meant to be in addition to PeterZ's, or standalone?
> >
>
> Standalone.
>
> > Also please send proper patches in the future, this one was whitespace
> > damaged and a pain to get applied.
> >
> > With just this patch applied (without PeterZ's) I still managed to trigger the
> > following warning.
>
> Thanks for the test.

The machine also crashed a few minutes later.

[ 2972.105858] INFO: rcu_sched detected stalls on CPUs/tasks: { 3} (detected by 7, t=5482 jiffies, g=338012, c=338011, q=205)
[ 2972.118544] Task dump for CPU 3:
[ 2972.122706] perf_fuzzer R running task 0 9409 2404 0x0000000c
[ 2972.131021] 0000000000000092 ffffffff81030dfb 0000000300000005 0000000400000004
[ 2972.139762] ffff88011eacc4a0 0000000000000005 0000000000000092 0000000500000002
[ 2972.148507] ffffffff81030e6b ffff8801197b2800 ffff8801197b2800 ffff88011eacbd80
[ 2972.157225] Call Trace:
[ 2972.160547] [<ffffffff81030dfb>] ? intel_start_scheduling+0x4b/0x70
[ 2972.168098] [<ffffffff81030e6b>] ? intel_stop_scheduling+0x4b/0x70
[ 2972.175513] [<ffffffff816a853b>] ? _raw_spin_unlock+0x2b/0x40
[ 2972.182502] [<ffffffff81030e6b>] ? intel_stop_scheduling+0x4b/0x70
[ 2972.189920] [<ffffffff81029f32>] ? x86_schedule_events+0x1e2/0x260
[ 2972.197340] [<ffffffff810b78f6>] ? __lock_acquire.isra.31+0x3a6/0xf90
[ 2972.204998] [<ffffffff810b78f6>] ? __lock_acquire.isra.31+0x3a6/0xf90
[ 2972.212670] [<ffffffff811585b2>] ? perf_event_update_userpage+0x102/0x170
[ 2972.220680] [<ffffffff811585ca>] ? perf_event_update_userpage+0x11a/0x170
[ 2972.228699] [<ffffffff811584b0>] ? perf_event_task_disable+0xd0/0xd0
[ 2972.236281] [<ffffffff81031d7b>] ? intel_pmu_enable_event+0xfb/0x210
[ 2972.243882] [<ffffffff810303a4>] ? intel_pmu_pebs_enable_all+0x34/0x40
[ 2972.251652] [<ffffffff810309ed>] ? __intel_pmu_enable_all+0x8d/0xc0
[ 2972.259115] [<ffffffff81030a30>] ? intel_pmu_enable_all+0x10/0x20
[ 2972.266402] [<ffffffff8102a95c>] ? x86_pmu_enable+0x25c/0x2e0
[ 2972.273316] [<ffffffff81156202>] ? perf_pmu_enable+0x22/0x30
[ 2972.280152] [<ffffffff81157da1>] ? __perf_install_in_context+0x131/0x1d0
[ 2972.288087] [<ffffffff811533f2>] ? remote_function+0x42/0x50
[ 2972.294856] [<ffffffff810f1046>] ? generic_exec_single+0xb6/0x120
[ 2972.302156] [<ffffffff8115d13a>] ? SYSC_perf_event_open+0xb4a/0xd40
[ 2972.309584] [<ffffffff811533b0>] ? cpu_clock_event_start+0x40/0x40
[ 2972.316895] [<ffffffff810f1160>] ? smp_call_function_single+0xb0/0x110
[ 2972.324617] [<ffffffff81152484>] ? task_function_call+0x44/0x50
[ 2972.331698] [<ffffffff81157c70>] ? perf_mux_hrtimer_handler+0x1f0/0x1f0
[ 2972.339490] [<ffffffff81152843>] ? perf_install_in_context+0x83/0xf0
[ 2972.347014] [<ffffffff8115d171>] ? SYSC_perf_event_open+0xb81/0xd40
[ 2972.354436] [<ffffffff8115d7a9>] ? SyS_perf_event_open+0x9/0x10
[ 2972.361469] [<ffffffff816a8df2>] ? entry_SYSCALL_64_fastpath+0x16/0x7a
[ 2973.026502] ------------[ cut here ]------------
[ 2973.031822] WARNING: CPU: 3 PID: 9409 at kernel/watchdog.c:311 watchdog_overflow_callback+0x84/0xa0()
[ 2973.042038] Watchdog detected hard LOCKUP on cpu 3
[ 2973.124350] CPU: 3 PID: 9409 Comm: perf_fuzzer Tainted: G W 4.2.0-rc1+ #166
[ 2973.133447] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 01/26/2014
[ 2973.141844] ffffffff81a28ae2 ffff88011eac5af0 ffffffff816a10a3 0000000000000000
[ 2973.150325] ffff88011eac5b40 ffff88011eac5b30 ffffffff8106ec8a ffff88011eac5c40
[ 2973.158787] ffff880119133800 0000000000000001 ffff88011eac5c40 ffff88011eac5ef8
[ 2973.167248] Call Trace:
[ 2973.170317] <NMI> [<ffffffff816a10a3>] dump_stack+0x45/0x57
[ 2973.176960] [<ffffffff8106ec8a>] warn_slowpath_common+0x8a/0xc0
[ 2973.183818] [<ffffffff8106ed06>] warn_slowpath_fmt+0x46/0x50
[ 2973.190416] [<ffffffff8102f676>] ? intel_pmu_drain_pebs_nhm+0x176/0x2e0
[ 2973.198013] [<ffffffff8111b784>] watchdog_overflow_callback+0x84/0xa0
[ 2973.205422] [<ffffffff8115af7c>] __perf_event_overflow+0x8c/0x1c0
[ 2973.212460] [<ffffffff8115bae4>] perf_event_overflow+0x14/0x20
[ 2973.219226] [<ffffffff810321b4>] intel_pmu_handle_irq+0x1d4/0x440
[ 2973.226340] [<ffffffff81028e76>] perf_event_nmi_handler+0x26/0x40
[ 2973.233400] [<ffffffff810181ad>] nmi_handle+0x9d/0x140
[ 2973.239427] [<ffffffff81018115>] ? nmi_handle+0x5/0x140
[ 2973.245540] [<ffffffff810184b9>] default_do_nmi+0xc9/0x120
[ 2973.251932] [<ffffffff8101859d>] do_nmi+0x8d/0xc0
[ 2973.257507] [<ffffffff816ab01f>] end_repeat_nmi+0x1e/0x2e
[ 2973.263846] [<ffffffff81035d76>] ? intel_bts_enable_local+0x26/0x40
[ 2973.271087] [<ffffffff81035d76>] ? intel_bts_enable_local+0x26/0x40
[ 2973.278330] [<ffffffff81035d76>] ? intel_bts_enable_local+0x26/0x40
[ 2973.285556] <<EOE>> [<ffffffff810309ed>] ? __intel_pmu_enable_all+0x8d/0xc0
[ 2973.293646] [<ffffffff81030a30>] intel_pmu_enable_all+0x10/0x20
[ 2973.300538] [<ffffffff8102a95c>] x86_pmu_enable+0x25c/0x2e0
[ 2973.307048] [<ffffffff81156202>] perf_pmu_enable+0x22/0x30
[ 2973.313470] [<ffffffff81157da1>] __perf_install_in_context+0x131/0x1d0
[ 2973.320973] [<ffffffff811533f2>] remote_function+0x42/0x50
[ 2973.327406] [<ffffffff810f1046>] generic_exec_single+0xb6/0x120
[ 2973.334300] [<ffffffff8115d13a>] ? SYSC_perf_event_open+0xb4a/0xd40
[ 2973.341513] [<ffffffff811533b0>] ? cpu_clock_event_start+0x40/0x40
[ 2973.348676] [<ffffffff810f1160>] smp_call_function_single+0xb0/0x110
[ 2973.355976] [<ffffffff81152484>] task_function_call+0x44/0x50
[ 2973.362705] [<ffffffff81157c70>] ? perf_mux_hrtimer_handler+0x1f0/0x1f0
[ 2973.370291] [<ffffffff81152843>] perf_install_in_context+0x83/0xf0
[ 2973.377478] [<ffffffff8115d171>] SYSC_perf_event_open+0xb81/0xd40
[ 2973.384552] [<ffffffff8115d7a9>] SyS_perf_event_open+0x9/0x10
[ 2973.391242] [<ffffffff816a8df2>] entry_SYSCALL_64_fastpath+0x16/0x7a
[ 2973.398575] ---[ end trace a75b257dea18211c ]---
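
For anyone skimming the traces above: frames printed with a leading "?" are
addresses the unwinder found on the stack but could not confirm as part of the
call chain, so they may be stale. The first dump (Task dump for CPU 3) consists
only of such frames, while the NMI trace has confirmed ones. Below is a minimal
sketch, not part of the original report, that separates the two kinds. The
helper name parse_frames and the regular expression are assumptions made for
illustration, and they only handle the "[<addr>] symbol+0xoff/0xsize" layout
seen in this log.

```python
import re

# Matches one frame such as
#   "[<ffffffff81030a30>] ? intel_pmu_enable_all+0x10/0x20"
# The optional "?" marks a frame the unwinder could not confirm.
# Hypothetical pattern: tuned to the log layout in this message only.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<guess>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def parse_frames(log_text):
    """Return (symbol, reliable) pairs in the order they appear in the log."""
    frames = []
    for match in FRAME_RE.finditer(log_text):
        reliable = match.group("guess") is None
        frames.append((match.group("symbol"), reliable))
    return frames


# Three lines copied from the NMI trace above.
sample = """\
[ 2973.285556] <<EOE>> [<ffffffff810309ed>] ? __intel_pmu_enable_all+0x8d/0xc0
[ 2973.293646] [<ffffffff81030a30>] intel_pmu_enable_all+0x10/0x20
[ 2973.300538] [<ffffffff8102a95c>] x86_pmu_enable+0x25c/0x2e0
"""

for symbol, reliable in parse_frames(sample):
    # Prefix unconfirmed frames with "?" the same way the kernel does.
    print(("   " if reliable else " ? ") + symbol)
```

Feeding the whole dmesg excerpt to parse_frames and keeping only the confirmed
frames gives the call chain from SyS_perf_event_open down to the NMI handler
without the stale entries.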
