Re: [PATCH] perf/core: introduce context per CPU event list

From: Peter Zijlstra
Date: Thu Nov 10 2016 - 07:13:31 EST


On Thu, Nov 10, 2016 at 12:04:23PM +0000, Mark Rutland wrote:
> On Thu, Nov 10, 2016 at 12:37:05PM +0100, Peter Zijlstra wrote:

> > So the problem is finding which events are active when.
>
> Sure.
>
> If we only care about PERF_EVENT_STATE_ACTIVE, then I think we can
> fairly easily maintain a perf_event_context::active_event_list at
> event_sched_{in,out}() time (or somewhere close to that).
>
> If we need PERF_EVENT_STATE_INACTIVE events, then that doesn't work,
> since we can give up early and not schedule some eligible events.
>
> > If we stick all events in an RB-tree sorted on: {pmu,cpu,runtime} we
> > can, fairly easily, find the relevant subtree and limit the iteration.
> > Esp. if we use a threaded tree.
>
> That would cater for big.LITTLE, certainly, but I'm not sure I follow
> how that helps to find active events -- you'll still have to iterate
> through the whole PMU subtree to find which are active, no?

Ah, so the tree would in fact only contain 'INACTIVE' events :-)

That is, when no events are on the hardware, all events (if there are
any) are INACTIVE.

Then on sched-in, we find the relevant subtree, and linearly try and
program all events from that subtree onto the PMU. Once adding an event
fails programming, we stop (like we do now).

These programmed events transition from INACTIVE to ACTIVE, and we take
them out of the tree.

Then on sched-out, we remove all events from the hardware, increase the
events their runtime value by however long they were ACTIVE, flip them
to INACTIVE and stuff them back in the tree.

(I'm can't quite recall if we can easily find ACTIVE events from a PMU,
but if not, we can easily track those on a separate list).