Re: [RFC] perf_events: ctx_flexible_sched_in() not maximizing PMU utilization

From: Stephane Eranian
Date: Fri May 07 2010 - 06:49:28 EST

Next message: Nils Radtke: "ath5k deadlock w/ blob transfer"
Previous message: Eric B Munson: "[PATCH] Split executable and non-executable mmap tracking"
In reply to: Peter Zijlstra: "Re: [RFC] perf_events: ctx_flexible_sched_in() not maximizing PMU utilization"
Next in thread: Peter Zijlstra: "Re: [RFC] perf_events: ctx_flexible_sched_in() not maximizing PMU utilization"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Fri, May 7, 2010 at 12:06 PM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> On Fri, 2010-05-07 at 11:37 +0200, Stephane Eranian wrote:
>
>> > If we define lag to be the difference between perfect service and our
>> > approximation thereof: lag_i = S - s_i, then for a scheduler to be fair
>> > we must place two conditions thereon:
>> >
>>
>
>> > So eligibility can be expressed as: s_i < avg(s_i).
>> >
>> Which would mean: if my total time on PMU is less than the average
>> time on the PMU for all events thus far, then "schedule me now".
>
Agreed.

>
>> You would have to sort the event by increasing s_i (using the RB tree, I assume)
>
> Exactly.
>
You'd have to insert all event into the tree, read leftmost.
I believe you need more than just basic integer arithmetic
to compare s_i to avg. Or you need to tweak the values
to get more precisions. But you may already be doing that
elsewhere in the kernel.

>> > With this, we will get a schedule like:
>> >
>> > / {A, C}, {B} /
>> >
>> > We are however still fully greedy, which is still O(n), which we don't
>> > want. However if we stop being greedy and use the same heuristic we do
>> > now, stop filling the PMU at the first fail, we'll still be fair,
>> > because the algorithm ensures that.
>> >
>> Let's see if I understand with an example. Assume the PMU multiplex
>> timing is 1ms, 2 counters. s(n) = total time in ms at time n.
>>
>> evt Â A ÂB ÂC
>> s(0) Â0 Â 0 Â0 -> avg = 0/3=0.00, sort = A, B, C, schedule A, fail on B
>> s(1) Â1 Â 0 Â0 -> avg = 1/3=0.33, sort = B, C, A, schedule B, C,
>> s(2) Â1 Â 1 Â1 -> avg = 3/3=1.00, sort = A, B, C, schedule A, fail on B
>> s(3) Â2 Â 1 Â1 -> avg = 4/3=1.33, sort = B, C, A, schedule B, C
>> s(4) Â2 Â 2 Â2 -> avg = 6/3=2.00, sort = A, B, C, schedule A, fail on B
>> s(5) Â3 Â 2 Â2 -> avg = 5/3=1.66, sort = B, C, A, schedule B, C
>>
>> What if there is no constraints on all 3 events?
>>
>> evt Â A ÂB ÂC
>> s(0) Â0 Â 0 Â0 -> avg = 0/3=0.00, sort = A, B, C, schedule A, B
>> s(1) Â1 Â 1 Â0 -> avg = 2/3=0.66, sort = C, A, B, schedule C (A, B > avg)
>> s(2) Â1 Â 1 Â1 -> avg = 3/3=1.00, sort = A, B, C, schedule A, B
>> s(3) Â2 Â 2 Â1 -> avg = 5/3=1.66, sort = C, A, B, schedule C (A, B > avg)
>> s(4) Â2 Â 2 Â2 -> avg = 6/3=2.00, sort = B, C, A, schedule B, C
>> s(5) Â2 Â 3 Â3 -> avg = 8/3=2.66, sort = A, B, C, schedule A (B, C > avg)
>> s(6) Â3 Â 3 Â3 -> avg = 9/3=3.00, sort = A, B, C, schedule A, B
>>
>> When all timings are equal, sort could yield any order, it would not matter
>> because overtime each event will be scheduled if it lags.
>>
>> Am I understanding your algorithm right?
>
> Perfectly!
>
> So the ramification of not using a greedy algorithm is that the
> potential schedule of constrained events/groups gets longer than is
> absolutely required, but I think that is something we'll have to live
> with, since O(n) just isn't a nice option.
>
> This can be illustrated if we consider B to be exclusive with both A and
> C, in that case we could end up with:
>
> / {A}, {B}, {C} /
>
> instead of
>
> / {A, C}, {B} /
>
> Depending on the order in which we find events sorted.

Yes. Not clear how you could avoid this without having a global
view of the dependencies between events (which are really event
groups, BTW). Here you'd need to know that if you have
evt A B C
s(0) 0 0 0 -> avg = 0/3=0.00, sort = A, B, C, schedule A, fail on B
s(1) 1 0 0 -> avg = 1/3=0.33, sort = B, C, A, schedule B, fail on C

You'd have two options:
s(2) 1 1 0 -> avg = 2/3=0.66, sort = C, A, B, schedule A, C
or
s(2) 1 1 0 -> avg = 2/3=0.66, sort = C, B, A schedule C

The first is more attractive because it shortens the blind spots on C.
Both are equally fair, though. Looks like you'd need to add a 2nd
parameter to the sort when s_i are equal. That would have to be
related to the number of constraints. You could sort in reverse order
to the number of constraints, assuming you can express the constraint
as a number. For simple constraints, such as counter restrictions, you
could simply compare the number of possible counters between events.
But there are PMU where there is no counter constraints but events are
incompatible. What values do you compare in this case?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Next message: Nils Radtke: "ath5k deadlock w/ blob transfer"
Previous message: Eric B Munson: "[PATCH] Split executable and non-executable mmap tracking"
In reply to: Peter Zijlstra: "Re: [RFC] perf_events: ctx_flexible_sched_in() not maximizing PMU utilization"
Next in thread: Peter Zijlstra: "Re: [RFC] perf_events: ctx_flexible_sched_in() not maximizing PMU utilization"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]