Re: [RFC 2/6] perf/core: add a rb-tree index to inactive_groups

From: David Carrillo-Cisneros
Date: Fri Jan 13 2017 - 02:34:51 EST


On Thu, Jan 12, 2017 at 3:47 AM, Mark Rutland <mark.rutland@xxxxxxx> wrote:
> On Tue, Jan 10, 2017 at 12:20:00PM -0800, David Carrillo-Cisneros wrote:
>> On Tue, Jan 10, 2017 at 6:14 AM, Mark Rutland <mark.rutland@xxxxxxx> wrote:
>> > On Tue, Jan 10, 2017 at 02:24:58AM -0800, David Carrillo-Cisneros wrote:
>> > For example, on a big.LITTLE system, big and little CPU PMUs share the
>> > same context, but their events are mutually incompatible. On big CPUs we
>> > only want to consider the sub-tree of big events, and on little CPUs we
>> > only want to consider little events. Hence, we need to be abel to search
>> > by PMU.
>>
>> I see it now. So, if PMU were added to the rb-tree keys. How can the
>> generic code know what's the PMU of the current CPU?
>
> I'm not immediately sure.
>
> We might need to augment struct pmu or perf_event_context with
> information such that we can determine that. That's not something I'd
> considered in great detail, and I'm not sure if peter had something in
> mind.

On second thought, I think the pmu pointer or id can be retrieved from
the event since the tree key only needs to be calculated when an event
is to be inserted.

>
>> > For SW PMUs, pmu::add() should never fail, and regardless of the order
>> > of the list we should be able to pmu::add() all events. Given that, why
>> > does the manner in which rotation occurs matter for SW PMUs?
>> >
>> >> Another complicatino is that using ctx->time (or timestamp) implies that
>> >> groups added during the same context switch may not have unique key.
>> >> This increases the complexity of that finds all events in the rb-tree
>> >> that are within a time interval.
>> >
>> > Could you elaborate on this? I don't understand what the problem is
>> > here. If we need uniqueness where {pmu,cpu,runtime} are equal, can't we
>> > extend the comparison to {pmu,cpu,runtime,event pointer}? That way
>> > everything we need is already implicit in the event, and we don't need
>> > perf_event::rbtree_key nor do we need
>> > perf_event_context::nr_inactive_added.
>>
>> Yes, we could extend the comparison. But I am trying to keep the key a
>> u64 to speed up things.
>>
>> I found it easier to simply create a counter and use it as an equivalent to
>> (timestamp, unique id). Both ways induce the same order of events.
>
> As I mentioned before, I believe that Peter's intent was to consider
> runtime, rather than a last-scheduled timestamp, so I don't think the
> counter is equivalent. It might be that either way is fine; I'll leave
> it to Peter to weigh in.

Also, if there is no real need for an timestamp and or runtime, and it's enough
to keep insertion order (so rotation occurs naturally). Then the rb-tree could
be simplified to only contain either {cpu,pmu,flexible} or {cgroup,flexible} in
its key and each node in the tree would be a sorted list of events.

In this series, the only search operations I do in the rb-tree are
"find first event
and "find last event" in each {cpu,pmu,flexible}/{cgroup,flexible} group. This
could be done faster with a list and we'd save the tree rebalancing.

>
> Do we have any benchmark figures either way?

I haven't measured anything yet.

>
> Thanks,
> Mark.