Re: Re: Question about "Make sched entity usage tracking scale-invariant"

From: Chao Xie
Date: Tue May 26 2015 - 21:51:44 EST



At 2015-05-26 19:05:36, "Morten Rasmussen" <morten.rasmussen@xxxxxxx> wrote:
>Hi,
>
>[Adding maintainers and others to cc]
>
>On Mon, May 25, 2015 at 02:19:43AM +0100, Chao Xie wrote:
>> hi
>> I saw the patch "sched: Make sched entity usage tracking
>> scale-invariant" that makes the usage frequency-scaled.
>> If the delta period that the usage calculation is based on crosses a
>> frequency change, how can you make sure the usage calculation is
>> correct?
>> The delta period may last hundreds of microseconds, and the frequency
>> change window may be 10-20 microseconds, so many frequency changes can
>> happen during the delta period.
>> It seems the patch does not consider this, and just picks up the
>> current frequency.
>> So how can you resolve this issue?
>
>Right. We don't know how many times the frequency may have changed since
>the last time we updated the entity usage tracking for the particular
>entity. All we do is call arch_scale_freq_capacity() and use that
>scaling factor to compensate for whatever changes might have taken
>place.
>
>The easiest implementation of arch_scale_freq_capacity() for most
>architectures is to just return a scaling factor computed from the
>current frequency, ignoring when exactly the change happened and whether
>multiple changes happened. Depending on how often the frequency might
>change, this might be an acceptable approximation. While the task is
>running the sched tick will update the entity usage tracking (every 10ms
>by default on most ARM systems), hence we shouldn't be more than a tick
>off in terms of when the frequency change is accounted for. Under normal
>circumstances the delta period should therefore be <10ms, and generally
>shorter than that if you have more than one task runnable on the cpu or
>the task(s) are not always-running. It is not perfect, but it is a lot
>better than the utilization tracking currently used by cpufreq governors
>and better than the scheduler being completely unaware of frequency
>scaling.
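
To make that compensation concrete, here is a minimal userspace sketch of
the "easiest implementation". SCHED_CAPACITY_SHIFT mirrors the kernel's
fixed-point capacity unit; the freq_scale() helper and the frequencies are
made up for illustration and are not the actual kernel code:

#include <stdio.h>
#include <stdint.h>

#define SCHED_CAPACITY_SHIFT 10        /* capacity tracked in units of 1024 */

/* Illustrative stand-in for arch_scale_freq_capacity(): the ratio of the
 * current frequency to the maximum frequency, in capacity units. */
static uint64_t freq_scale(uint64_t cur_khz, uint64_t max_khz)
{
        return (cur_khz << SCHED_CAPACITY_SHIFT) / max_khz;
}

int main(void)
{
        uint64_t delta_us = 10000;                    /* time since last update */
        uint64_t scale = freq_scale(600000, 1200000); /* half speed -> 512 */

        /* The whole delta gets one compensation factor, no matter how many
         * frequency changes actually happened inside it. */
        uint64_t scaled_delta = (delta_us * scale) >> SCHED_CAPACITY_SHIFT;

        printf("scale=%llu scaled_delta=%lluus\n",
               (unsigned long long)scale, (unsigned long long)scaled_delta);
        return 0;
}

An in-kernel implementation would presumably keep such a per-cpu factor up
to date from cpufreq transition notifications; the arithmetic applied to
the delta stays the same.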
>
>For systems with very frequent frequency changes, i.e. fast hardware and
>an aggressive governor leading to multiple changes in less than 10ms,
>the solution above might not be sufficient. In that case, I think a
>better solution might be to track the average frequency using hardware
>counters or whatever tracking metrics the system might have, and let
>arch_scale_freq_capacity() return the average performance delivered over
>the most recent period of time. AFAIK, x86 already has performance
>counters (APERF/MPERF) that could be used for this purpose. The delta
>period for each entity tracking update isn't fixed, but it might be
>sufficient to just average over some fixed period of time. Accurate
>tracking would require some time-stamp information to be stored in each
>sched_entity for the true average to be computed for the delta period.
>That quickly becomes rather messy, but not impossible. I did look at it
>briefly a while back, but decided not to go down that route until we
>know that using the current frequency or some fixed period average isn't
>going to be sufficient. Usage or utilization is an average of something
>that might be constantly changing anyway, so it is never going to be
>very accurate. If it does turn out that we can't get the overall
>picture right, we will need to improve it.
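
As a rough sketch of the APERF/MPERF arithmetic (reading the MSRs is left
out; the deltas are assumed to be sampled at the window boundaries): APERF
advances with the cycles actually executed while MPERF advances at a fixed
reference frequency, so the ratio of their deltas over a window
approximates the average delivered performance.

#include <stdint.h>

#define SCHED_CAPACITY_SHIFT 10

/*
 * Average frequency scaling factor over a window, from the APERF/MPERF
 * deltas sampled at the window boundaries.  aperf_delta/mperf_delta is
 * roughly avg_freq/ref_freq for the active part of the window.
 */
uint64_t avg_freq_scale(uint64_t aperf_delta, uint64_t mperf_delta)
{
        uint64_t scale;

        if (!mperf_delta)
                return 1 << SCHED_CAPACITY_SHIFT;  /* no data: assume full speed */

        scale = (aperf_delta << SCHED_CAPACITY_SHIFT) / mperf_delta;

        /* Turbo can push APERF above MPERF; clamp for this sketch so the
         * factor stays within the 0..1024 capacity range. */
        if (scale > (1 << SCHED_CAPACITY_SHIFT))
                scale = 1 << SCHED_CAPACITY_SHIFT;

        return scale;
}

Whether the reference should be the base or the max frequency, and how
turbo is treated, would need sorting out for a real implementation.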
>
>Updating the entity tracking for each frequency change adds too much
>overhead, I think, and seems unnecessary if we can make do with an
>average scaling factor.
>
>I hope that answers your question. Have you observed any problems with
>the usage tracking?
>

Thanks for the explanation.

I agree that the "delta" is less than 10ms in most situations, but I think at least one case
needs to be considered.
Suppose the frequency change happens just a little before the task updates its utilization,
for example 10us before an update whose delta is 10ms. Then almost the whole delta will be
scaled based on the new frequency, not the old one. The frequency change can be from the
lowest to the highest, so for that period the delta calculation has a big deviation, and
this situation is not rare.
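
To put rough numbers on that concern (purely illustrative: a 10ms delta, a
lowest OPP at 25% of max capacity, and the switch to the highest OPP
happening 10us before the update):

#include <stdio.h>
#include <stdint.h>

#define SCHED_CAPACITY_SHIFT 10

int main(void)
{
        uint64_t delta_us = 10000;    /* time since the last usage update */
        uint64_t low_scale = 256;     /* lowest OPP: 25% of max capacity */
        uint64_t high_scale = 1024;   /* highest OPP */
        uint64_t at_high_us = 10;     /* only the last 10us ran at the high OPP */

        /* What per-change accounting of the delta would record. */
        uint64_t exact = ((delta_us - at_high_us) * low_scale +
                          at_high_us * high_scale) >> SCHED_CAPACITY_SHIFT;

        /* What scaling the whole delta with the new (current) frequency records. */
        uint64_t approx = (delta_us * high_scale) >> SCHED_CAPACITY_SHIFT;

        printf("exact=%lluus approx=%lluus\n",
               (unsigned long long)exact, (unsigned long long)approx);
        return 0;
}

For this window the usage contribution comes out roughly four times too
high; the geometric decay averages the error down over the following
windows, but a low-to-high switch right before an update can still produce
a visible spike.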

>Thanks,
>Morten