Re: [PATCH 00/12] perf: more fixes

From: Peter Zijlstra
Date: Thu Mar 10 2016 - 09:39:56 EST

On Wed, Feb 24, 2016 at 06:45:39PM +0100, Peter Zijlstra wrote:

> With these patches syz-kaller can still trigger some fail; most notably some
> NMI watchdog triggers and a very sporadic unthrottle bug (much like last time).

So the below seems to make the sporadic unthrottle thing much less
likely in that I haven't seen it in several hours, my machine keeps
dying on NMI watchdog bits.

Boris, who has been running syz-kaller on AMD hardware and was hitting a
very similar bug with the AMD-IBS code, says its not fixed it for him,
so maybe there's still more to find.

Subject: perf: Fix unthrottle

Its possible to IOC_PERIOD while the event is throttled, this would
re-start the event and the next tick would then try to unthrottle it,
and find the event wasn't actually stopped anymore.

This would tickle a WARN in the x86-pmu code which isn't expecting to
start a !stopped event.

Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
kernel/events/core.c | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 712570dddacd..d39477390415 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -4210,6 +4210,14 @@ static void __perf_event_period(struct perf_event *event,
active = (event->state == PERF_EVENT_STATE_ACTIVE);
if (active) {
+ /*
+ * We could be throttled; unthrottle now to avoid the tick
+ * trying to unthrottle while we already re-started the event.
+ */
+ if (event->hw.interrupts == MAX_INTERRUPTS) {
+ event->hw.interrupts = 0;
+ perf_log_throttle(event, 1);
+ }
event->pmu->stop(event, PERF_EF_UPDATE);