Re: [RFC 2/6] perf/core: add a rb-tree index to inactive_groups

From: Mark Rutland
Date: Thu Jan 12 2017 - 06:48:26 EST


On Tue, Jan 10, 2017 at 12:20:00PM -0800, David Carrillo-Cisneros wrote:
> On Tue, Jan 10, 2017 at 6:14 AM, Mark Rutland <mark.rutland@xxxxxxx> wrote:
> > On Tue, Jan 10, 2017 at 02:24:58AM -0800, David Carrillo-Cisneros wrote:
> > For example, on a big.LITTLE system, big and little CPU PMUs share the
> > same context, but their events are mutually incompatible. On big CPUs we
> > only want to consider the sub-tree of big events, and on little CPUs we
> > only want to consider little events. Hence, we need to be able to search
> > by PMU.
>
> I see it now. So, if PMU were added to the rb-tree keys, how can the
> generic code know what the PMU of the current CPU is?

I'm not immediately sure.

We might need to augment struct pmu or perf_event_context with
information such that we can determine that. That's not something I'd
considered in great detail, and I'm not sure if Peter had something in
mind.

> > For SW PMUs, pmu::add() should never fail, and regardless of the order
> > of the list we should be able to pmu::add() all events. Given that, why
> > does the manner in which rotation occurs matter for SW PMUs?
> >
> >> Another complication is that using ctx->time (or timestamp) implies that
> >> groups added during the same context switch may not have a unique key.
> >> This increases the complexity of the code that finds all events in the
> >> rb-tree that are within a time interval.
> >
> > Could you elaborate on this? I don't understand what the problem is
> > here. If we need uniqueness where {pmu,cpu,runtime} are equal, can't we
> > extend the comparison to {pmu,cpu,runtime,event pointer}? That way
> > everything we need is already implicit in the event, and we don't need
> > perf_event::rbtree_key nor do we need
> > perf_event_context::nr_inactive_added.
>
> Yes, we could extend the comparison. But I am trying to keep the key a
> u64 to speed up things.
>
> I found it easier to simply create a counter and use it as an equivalent to
> (timestamp, unique id). Both ways induce the same order of events.

As I mentioned before, I believe that Peter's intent was to consider
runtime, rather than a last-scheduled timestamp, so I don't think the
counter is equivalent. It might be that either way is fine; I'll leave
it to Peter to weigh in.
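
For illustration, the extended {pmu,cpu,runtime,event pointer} comparison
could look roughly like the below. This is only a userspace sketch under
stated assumptions: struct sketch_event, its fields, and both function names
are made up for the example and are not the real perf_event layout or any
existing kernel helper.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the few perf_event fields the key would need. */
struct sketch_event {
	int pmu_id;		/* stand-in for event->pmu */
	int cpu;		/* stand-in for event->cpu */
	uint64_t runtime;	/* accumulated time the event has been scheduled */
};

/*
 * Order by {pmu, cpu, runtime, event pointer}.
 * Returns <0, 0 or >0, in the style of memcmp().
 */
static int sketch_event_cmp(const struct sketch_event *a,
			    const struct sketch_event *b)
{
	if (a->pmu_id != b->pmu_id)
		return a->pmu_id < b->pmu_id ? -1 : 1;
	if (a->cpu != b->cpu)
		return a->cpu < b->cpu ? -1 : 1;
	if (a->runtime != b->runtime)
		return a->runtime < b->runtime ? -1 : 1;
	/* Tie-break on the address, so two distinct events never compare equal. */
	if (a != b)
		return (uintptr_t)a < (uintptr_t)b ? -1 : 1;
	return 0;
}

/* Two events with identical {pmu, cpu, runtime} still get a strict order. */
static void sketch_demo(void)
{
	struct sketch_event ev[2] = {
		{ .pmu_id = 0, .cpu = 1, .runtime = 100 },
		{ .pmu_id = 0, .cpu = 1, .runtime = 100 },
	};

	assert(sketch_event_cmp(&ev[0], &ev[1]) < 0);
	assert(sketch_event_cmp(&ev[1], &ev[0]) > 0);
	assert(sketch_event_cmp(&ev[0], &ev[0]) == 0);
}
```

With a comparison of this shape, uniqueness comes from the event's own
address, so no separate per-context counter is needed. It does cost more
compares than a single u64 key, which is the trade-off raised above.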

Do we have any benchmark figures either way?

Thanks,
Mark.