[RFC] perf/x86: Fix a warning on x86_pmu_stop()

From: Namhyung Kim
Date: Fri Nov 20 2020 - 21:50:42 EST


When large PEBS is enabled, the following warning is triggered:

[6070379.453697] WARNING: CPU: 23 PID: 42379 at arch/x86/events/core.c:1466 x86_pmu_stop+0x95/0xa0
...
[6070379.453831] Call Trace:
[6070379.453840] x86_pmu_del+0x50/0x150
[6070379.453845] event_sched_out.isra.0+0x95/0x200
[6070379.453848] group_sched_out.part.0+0x53/0xd0
[6070379.453851] __perf_event_disable+0xee/0x1e0
[6070379.453854] event_function+0x89/0xd0
[6070379.453859] remote_function+0x3e/0x50
[6070379.453866] generic_exec_single+0x91/0xd0
[6070379.453870] smp_call_function_single+0xd1/0x110
[6070379.453874] event_function_call+0x11c/0x130
[6070379.453877] ? task_ctx_sched_out+0x20/0x20
[6070379.453880] ? perf_mux_hrtimer_handler+0x370/0x370
[6070379.453882] ? event_function_call+0x130/0x130
[6070379.453886] perf_event_for_each_child+0x34/0x80
[6070379.453889] ? event_function_call+0x130/0x130
[6070379.453891] _perf_ioctl+0x24b/0x6a0
[6070379.453898] ? sched_setaffinity+0x1ad/0x2a0
[6070379.453904] ? _cond_resched+0x15/0x30
[6070379.453906] perf_ioctl+0x3d/0x60
[6070379.453912] ksys_ioctl+0x87/0xc0
[6070379.453917] __x64_sys_ioctl+0x16/0x20
[6070379.453923] do_syscall_64+0x52/0x180
[6070379.453928] entry_SYSCALL_64_after_hwframe+0x44/0xa9

Commit 3966c3feca3f ("x86/perf/amd: Remove need to check "running"
bit in NMI handler") introduced this. It seems x86_pmu_stop() can be
called recursively (for example, when it loses some samples), as
shown below:

  x86_pmu_stop
    intel_pmu_disable_event  (x86_pmu_disable)
      intel_pmu_pebs_disable
        intel_pmu_drain_pebs_buffer
          x86_pmu_stop

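It seems the ordering change made by that commit (disable the event
first, then clear the active mask) is only needed for AMD. So I added
a new bit, late_nmi, to select when x86_pmu_stop() should clear the
active mask: after the disable when the bit is set, before it
otherwise.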
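
To illustrate why the ordering matters, here is a minimal userspace
sketch, not kernel code. All names in it (toy_counter, toy_stop,
toy_disable, toy_run) are made up for this example, and the model
assumes that the disable callback re-enters the stop path exactly once
while samples are still buffered, as in the call chain above.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model (hypothetical, not kernel code): one counter, one active bit. */
struct toy_counter {
	bool active;      /* stands in for the bit in cpuc->active_mask */
	bool stopped;     /* stands in for PERF_HES_STOPPED in hwc->state */
	int  warnings;    /* counts would-be WARN_ON_ONCE() hits */
	int  pending;     /* buffered samples whose drain re-enters stop */
	bool clear_first; /* true: clear the active bit before disabling */
};

static void toy_stop(struct toy_counter *c);

/* Stands in for x86_pmu.disable(): draining samples may call stop again. */
static void toy_disable(struct toy_counter *c)
{
	if (c->pending > 0) {
		c->pending = 0;
		toy_stop(c); /* nested call */
	}
}

static void toy_stop(struct toy_counter *c)
{
	if (!c->active)
		return;

	if (c->clear_first) {
		/* Nested call sees the bit cleared and returns early. */
		c->active = false;
		toy_disable(c);
	} else {
		/* Nested call still sees the bit set and runs the full body. */
		toy_disable(c);
		c->active = false;
	}

	if (c->stopped)
		c->warnings++; /* second pass over an already stopped event */
	c->stopped = true;
}

/* Run one stop with a buffered sample; return the number of warnings. */
static int toy_run(bool clear_first)
{
	struct toy_counter c = {
		.active = true,
		.pending = 1,
		.clear_first = clear_first,
	};

	toy_stop(&c);
	return c.warnings;
}
```

In this model toy_run(false) (disable first, then clear) counts one
warning, while toy_run(true) (clear first, then disable) counts none,
which mirrors the two branches selected by the new bit.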

Fixes: 3966c3feca3f ("x86/perf/amd: Remove need to check "running" bit in NMI handler")
Reported-by: John Sperbeck <jsperbeck@xxxxxxxxxx>
Cc: "Lendacky, Thomas" <Thomas.Lendacky@xxxxxxx>
Signed-off-by: Namhyung Kim <namhyung@xxxxxxxxxx>
---
arch/x86/events/amd/core.c | 1 +
arch/x86/events/core.c | 9 +++++++--
arch/x86/events/perf_event.h | 3 ++-
3 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c
index 39eb276d0277..c545fbd423df 100644
--- a/arch/x86/events/amd/core.c
+++ b/arch/x86/events/amd/core.c
@@ -927,6 +927,7 @@ static __initconst const struct x86_pmu amd_pmu = {
 	.max_period		= (1ULL << 47) - 1,
 	.get_event_constraints	= amd_get_event_constraints,
 	.put_event_constraints	= amd_put_event_constraints,
+	.late_nmi		= 1,
 
 	.format_attrs		= amd_format_attr,
 	.events_sysfs_show	= amd_event_sysfs_show,
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 7b802a778014..a6c12bd71e66 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -1514,8 +1514,13 @@ void x86_pmu_stop(struct perf_event *event, int flags)
 	struct hw_perf_event *hwc = &event->hw;
 
 	if (test_bit(hwc->idx, cpuc->active_mask)) {
-		x86_pmu.disable(event);
-		__clear_bit(hwc->idx, cpuc->active_mask);
+		if (x86_pmu.late_nmi) {
+			x86_pmu.disable(event);
+			__clear_bit(hwc->idx, cpuc->active_mask);
+		} else {
+			__clear_bit(hwc->idx, cpuc->active_mask);
+			x86_pmu.disable(event);
+		}
 		cpuc->events[hwc->idx] = NULL;
 		WARN_ON_ONCE(hwc->state & PERF_HES_STOPPED);
 		hwc->state |= PERF_HES_STOPPED;
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 10032f023fcc..1ffaa0fcd521 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -682,7 +682,8 @@ struct x86_pmu {
 	/* PMI handler bits */
 	unsigned int	late_ack		:1,
 			enabled_ack		:1,
-			counter_freezing	:1;
+			counter_freezing	:1,
+			late_nmi		:1;
 	/*
 	 * sysfs attrs
 	 */
--
2.29.2.454.gaff20da3a2-goog