Re: [RFC PATCH v3 0/6] sched/cpufreq: Make schedutil energy aware
From: Peter Zijlstra
Date: Thu Oct 17 2019 - 15:07:30 EST
On Thu, Oct 17, 2019 at 03:23:04PM +0100, Douglas Raillard wrote:
> On 10/17/19 10:50 AM, Peter Zijlstra wrote:
> > I'm still thinking about the exact means you're using to raise C; that
> > is, the 'util - util_est' as cost_margin. It hurts my brain still.
>
> util_est is currently the best approximation of the actual portion of the CPU the task needs:
> 1) for periodic tasks, it's not too far from the duty cycle, and is always higher
>
> 2) for aperiodic tasks, it (indirectly) takes into account the total time it took
> to complete the previous activation, so the signal is not 100% composed of logical signals
> only relevant for periodic tasks (although it's a big part of it).
>
> 3) Points 1) and 2) together allow util_est to adapt to periodic tasks that change
> their duty cycle over time, without needing a very long history (the last task period
> is sufficient).
>
> For periodic tasks, the distance between instantaneous util_avg and the actual task
> duty cycle gives us our best guess of the (potential) change in the task
> duty cycle.
>
> util_est is the threshold (assuming util_avg increasing) for util_avg after which we know
> for sure that even if the task stopped right now, its duty cycle would be higher than
> during the previous period.
> This means for a given task and with (util >= util_est):
>
> 1) util - util_est == 0 means the task duty cycle will be equal to the one during
> the previous activation, if the task stopped executing right now.
>
> 2) util - util_est > 0 means the task duty cycle will be higher than the one during
> the previous activation, if the task stopped executing right now.
So far I can follow, 2) is indeed a fairly sane indication that
utilization is growing.
> Using the difference (util - util_est) will therefore give these properties to the boost signal:
> * no boost will be applied as long as the task has a constant or decreasing duty cycle.
>
> * when we can detect that the duty cycle increases, we temporarily increase the frequency.
> We start with a slight increase, and the longer we wait for the current period to finish,
> the more we boost, since the more likely it is that the task has a much larger duty cycle
> than anticipated. More specifically, the evaluation of "how much more" is done the exact
> same way as it is done for PELT, since the dynamic of the boost is "inherited" from PELT.
Right, because as long as it keeps running, util_est will not be changed,
so the difference will continue to increase.
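
To make that concrete, here is a rough sketch of how the margin evolves
while a task keeps running. It assumes a simplified closed-form PELT ramp
(32 ms half-life, 1024 scale) and is not the kernel's implementation; the
starting values and the helper names util_after_running() and
boost_margin() are hypothetical, picked only for illustration.

```python
# Simplified PELT model: a contribution halves every 32 ms.
# This is an illustrative approximation, not the kernel code.
PELT_HALFLIFE_MS = 32
Y = 0.5 ** (1 / PELT_HALFLIFE_MS)
SCHED_CAPACITY_SCALE = 1024


def util_after_running(util0, ms):
    """util_avg after running continuously for `ms` ms, starting at util0."""
    decay = Y ** ms
    return util0 * decay + SCHED_CAPACITY_SCALE * (1 - decay)


def boost_margin(util, util_est):
    """The 'util - util_est' margin, clamped so it never goes negative."""
    return max(0.0, util - util_est)


# Hypothetical values: util_est is frozen for the whole activation.
util_est = 300.0
util_at_wakeup = 200.0

# Sample the margin every 8 ms over 64 ms of uninterrupted running.
margins = [
    boost_margin(util_after_running(util_at_wakeup, ms), util_est)
    for ms in range(0, 65, 8)
]

for ms, margin in zip(range(0, 65, 8), margins):
    print(f"{ms:3d} ms: margin = {margin:6.1f}")
```

The margin stays at zero until util crosses util_est and then only grows,
following the same curve as PELT itself.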
What I don't see is how that difference makes sense as input to:
cost(x) : (1 + x) * cost_j
I suppose that limits the additional OPP to twice the previously
selected cost / efficiency (see the confusion from that other email).
But given that efficiency drops (or costs rise) for higher OPPs that
still doesn't really make sense..
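
For reference, my reading of that formula can be sketched as below. This is
only an interpretation under assumptions: x is taken to be in [0, 1], the
OPP table and its cost values are made up, and pick_boosted_opp() is a
hypothetical helper, not something from the series.

```python
# Hypothetical OPP table: (frequency in MHz, cost). Costs rise with
# frequency, as they do for real OPPs, but the numbers are invented.
OPPS = [(500, 10.0), (1000, 14.0), (1500, 21.0), (2000, 35.0)]


def pick_boosted_opp(opps, base_idx, x):
    """Return the index of the highest OPP at or above base_idx whose
    cost stays within cost(x) = (1 + x) * cost_j of the base OPP j."""
    budget = (1 + x) * opps[base_idx][1]
    chosen = base_idx
    for idx in range(base_idx, len(opps)):
        if opps[idx][1] <= budget:
            chosen = idx
    return chosen


# With x == 0 there is no boost; with x == 1 the budget is twice cost_j.
for x in (0.0, 0.5, 1.0):
    idx = pick_boosted_opp(OPPS, 0, x)
    print(f"x = {x:.1f}: base {OPPS[0][0]} MHz -> {OPPS[idx][0]} MHz")
```

With x capped at 1 the budget never exceeds twice cost_j, and since costs
climb with frequency, the same x buys fewer extra OPPs the higher the
base OPP already is.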
> Now if the task is aperiodic, the boost will allow reaching the highest frequency faster,
> which may or may not be desired. Ultimately, it's not more or less wrong than just picking
> the freq based on util_est alone, since util_est is already somewhat meaningless for aperiodic
> tasks. It just allows reaching the max freq at some point without waiting for too long, which is
> all we can do without more info on the task.
>
> When applying these boosting rules on the runqueue util signals, we are able to detect if at least one
> task needs boosting according to these rules. That only holds as long as the history we look at is
> the result of a stable set of tasks, i.e. no tasks added or removed from the rq.
So while I agree that 2) is a reasonable signal to work from, everything
that comes after is still very much confusing me.