Re: [RFC] perf_events: ctx_flexible_sched_in() not maximizing PMU utilization

From: Peter Zijlstra
Date: Thu May 06 2010 - 11:08:42 EST


On Thu, 2010-05-06 at 16:41 +0200, Stephane Eranian wrote:
> On Thu, May 6, 2010 at 4:20 PM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > On Thu, 2010-05-06 at 16:03 +0200, Stephane Eranian wrote:
> >> Hi,
> >>
> >> Looking at ctx_flexible_sched_in(), the logic is that if group_sched_in()
> >> fails for a HW group, then no other HW group in the list is even tried.
> >> I don't understand this restriction. Groups are independent of each other.
> >> The failure of one group should not block others from being scheduled,
> >> otherwise you under-utilize the PMU.
> >>
> >> What is the reason for this restriction? Can we lift it somehow?
> >
> > Sure, but it will make scheduling much more expensive. The current
> > scheme will only ever check the first N events because it stops at the
> > first that fails, and since you can fit at most N events on the PMU, it's
> > constant time.
> >
> You may fail not because the PMU is full but because an event is incompatible
> with the others, i.e., there may still be room for more events. By relying on
> the RR to get coverage for all events, you also increase blind spots for
> events which have been skipped. Longer blind spots imply less accuracy when
> you scale.
>
> > To fix this issue you'd have to basically always iterate all events and
> > only stop once the PMU is fully booked, which reduces to an O(n) worst
> > case algorithm.
> >
>
> Yes, but if you have X events and you don't know if you have at least N
> that are compatible with each other, then you have to scan the whole list.

I'm not sure why you're arguing; you asked why it does what it does, and
I gave an answer ;-)

I agree it's not optimal, but fixing it isn't trivial. I would very much
like to avoid a full O(n) loop over all events, especially since creating
them is an unprivileged operation.
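
To make the trade-off concrete, here is a rough userspace sketch, not the
kernel code. It assumes a toy model in which every group is a single event
that must land on one counter out of a bitmask of allowed counters; struct
group, try_sched_in(), sched_stop_at_first_fail() and sched_full_scan() are
all hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model (assumption): one event per group, constrained to a counter mask. */
struct group {
    unsigned allowed;   /* bitmask of counters this event may use */
};

/* Try to place one group; on success claim the lowest usable counter. */
static int try_sched_in(const struct group *g, unsigned *free_counters)
{
    unsigned usable = g->allowed & *free_counters;

    if (!usable)
        return 0;
    *free_counters &= ~(usable & (~usable + 1u)); /* clear lowest set bit of usable */
    return 1;
}

/* Stop at the first group that fails: at most (counters + 1) attempts. */
size_t sched_stop_at_first_fail(const struct group *g, size_t n,
                                unsigned free_counters, size_t *attempts)
{
    size_t placed = 0;

    assert(attempts != NULL);
    *attempts = 0;
    for (size_t i = 0; i < n; i++) {
        (*attempts)++;
        if (!try_sched_in(&g[i], &free_counters))
            break;
        placed++;
    }
    return placed;
}

/* Keep going until the PMU is full or the list ends: up to n attempts. */
size_t sched_full_scan(const struct group *g, size_t n,
                       unsigned free_counters, size_t *attempts)
{
    size_t placed = 0;

    assert(attempts != NULL);
    *attempts = 0;
    for (size_t i = 0; i < n && free_counters; i++) {
        (*attempts)++;
        if (try_sched_in(&g[i], &free_counters))
            placed++;
    }
    return placed;
}
```

With four counters and groups constrained to masks 0x1, 0x1, 0x2, 0x4, the
first variant places one group in two attempts, while the full scan places
three but has to walk the whole list. The list length is controlled by
whoever creates the events, which is the cost concern.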

So what we can look at is trying to do better, and making it a
service-based scheduler instead of a strict RR should at least get a more
equal distribution.
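
One possible reading of "service based", sketched as a toy userspace model
and not as a proposed patch: track how long each group has been on the PMU
and try the least-served groups first. The struct layout, the service
counter and sched_by_service() are assumptions made up for this
illustration, as is the one-event-per-group counter-mask model.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model (assumption): one event per group, constrained to a counter mask. */
struct group {
    unsigned allowed;        /* bitmask of counters this event may use */
    unsigned long service;   /* scheduling passes spent on the PMU so far */
    int on_pmu;              /* set if placed in the most recent pass */
};

/* qsort comparator over an array of group pointers: least service first. */
static int by_service(const void *a, const void *b)
{
    const struct group *x = *(const struct group *const *)a;
    const struct group *y = *(const struct group *const *)b;

    return (x->service > y->service) - (x->service < y->service);
}

/*
 * One scheduling pass: order groups by accumulated service, then place
 * them in that order, still stopping at the first failure.
 */
size_t sched_by_service(struct group **order, size_t n, unsigned free_counters)
{
    size_t placed = 0;

    qsort(order, n, sizeof(*order), by_service);
    for (size_t i = 0; i < n; i++)
        order[i]->on_pmu = 0;

    for (size_t i = 0; i < n; i++) {
        unsigned usable = order[i]->allowed & free_counters;

        if (!usable)
            break;
        free_counters &= ~(usable & (~usable + 1u)); /* claim lowest usable counter */
        order[i]->on_pmu = 1;
        order[i]->service++;
        placed++;
    }
    return placed;
}
```

A group that keeps getting skipped falls behind in service and moves toward
the front of the order, so it is tried earlier on later passes. Sorting on
every pass is only for brevity; it is not meant as a statement about how
this should be done in the kernel.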

Another thing we can do is quit at the second or third failure.
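
A minimal sketch of that failure budget, again as a toy userspace model in
which every group is a single event constrained to a counter mask. The
names sched_with_fail_budget() and max_fails are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model (assumption): one event per group, constrained to a counter mask. */
struct group {
    unsigned allowed;   /* bitmask of counters this event may use */
};

/*
 * Walk the list, tolerating up to (max_fails - 1) failed groups and
 * quitting on failure number max_fails. max_fails == 1 is the
 * stop-at-first-failure behaviour.
 */
size_t sched_with_fail_budget(const struct group *g, size_t n,
                              unsigned free_counters, unsigned max_fails)
{
    size_t placed = 0;
    unsigned fails = 0;

    assert(max_fails >= 1);
    for (size_t i = 0; i < n && free_counters; i++) {
        unsigned usable = g[i].allowed & free_counters;

        if (!usable) {
            if (++fails >= max_fails)
                break;          /* budget exhausted, give up */
            continue;           /* skip this group, try the next */
        }
        free_counters &= ~(usable & (~usable + 1u)); /* claim lowest usable counter */
        placed++;
    }
    return placed;
}
```

In this model the number of groups examined is bounded by the number of
counters plus max_fails, independent of how long the list is, so the cost
stays constant while a single incompatible group no longer blocks
everything behind it.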