Re: perf_events: proposed fix for broken intr throttling (repost)

From: Peter Zijlstra
Date: Wed Jan 04 2012 - 16:23:41 EST


On Wed, 2012-01-04 at 15:39 +0100, Stephane Eranian wrote:

> In running some tests with 3.2.0-rc7-tip, I noticed unexpected throttling
> notification samples. I was using fixed period with a long enough period
> that I could not possibly hit the default limit of 100000 samples/sec/cpu.
>
> I investigated the matter and discovered that the following commit
> is the culprit:
>
> commit 0f5a2601284237e2ba089389fd75d67f77626cef
> Author: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
> Date: Wed Nov 16 14:38:16 2011 +0100
>
> perf: Avoid a useless pmu_disable() in the perf-tick
>
>
> The throttling mechanism REQUIRES that the hwc->interrupt counter be reset
> at EACH timer tick. This is regardless of the fact that the counter is in fixed
> period or frequency mode. The optimization introduced in this patch breaks this
> by avoiding calling perf_ctx_adjust_freq() at each timer tick. For events with
> fixed period, it would not adjust any period at all BUT it would reset the
> throttling counter.
>
> Given the way the throttling mechanism is implemented we cannot avoid doing
> some work at each timer tick. Otherwise we loose many samples for no good
> reasons.
>
> One may also question the motivation behind checking the interrupt rate at
> each timer tick rather than every second, i.e., average it out over a longer
> period.

That also allows your system to be dead for longer..

> I see two solutions short term:
> 1 - revert the commit above
> 2 - special case the situation with no frequency-based sampling event
>
> I have implemented solution 2 with the draft fix below. It does not invoke
> perf_pmu_enable()/perf_pmu_disable(). I am not clear on whether or not this
> is really needed in this case. Please advise.

I don't think it needs that, I do dislike the unconditional iterate all
events thing though. Maybe we can set some per-cpu state indicating
someone got throttled (rare under normal operation -- you'd hope) and
only iterate to unthrottle when we find this set.

I think the event scheduling resulting from migration will already
re-enable the event, avoiding the loss of unthrottle due to that..
although it would be good to verify that.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/