Re: [PATCH RFC v5] cpufreq: schedutil: Make iowait boost more energy efficient

From: Joel Fernandes
Date: Wed Jul 19 2017 - 00:39:37 EST


Hi Viresh,

I appreciate the discussion.

On Mon, Jul 17, 2017 at 10:45 PM, Viresh Kumar <viresh.kumar@xxxxxxxxxx> wrote:
> On 17-07-17, 10:35, Joel Fernandes wrote:
>> On Mon, Jul 17, 2017 at 1:04 AM, Viresh Kumar <viresh.kumar@xxxxxxxxxx> wrote:
>> > On 16-07-17, 01:04, Joel Fernandes wrote:
>
>> >> + if (sg_cpu->iowait_boost_pending) {
>> >> + sg_cpu->iowait_boost_pending = false;
>> >> + sg_cpu->iowait_boost = min(sg_cpu->iowait_boost << 1,
>> >> + sg_cpu->iowait_boost_max);
>> >
>> > Now this has a problem. We will also boost after waiting for
>
> s/also/always/
>
>> > rate_limit_us. And that's why I had proposed the tricky solution in
>>
>> Not really unless rate_limit_us is < TICK_NSEC? Once TICK_NSEC
>> elapses, we would clear the boost in sugov_set_iowait_boost and in
>> sugov_next_freq_shared.
>
> You misread it and I know why it happened. And so I have sent a small
> patch to make it a bit more readable.
>
> rate_limit_us is associated with "last_freq_update_time", while
> iowait-boost is associated with "last_update".
>
> And last_update gets updated way too often.
>
>> > the first place. I thought we wanted to avoid instant boost only for
>> > the first iteration, but after that we wanted to do it ASAP. Isn't it?
>> >
>> > Now that you are using policy->min instead of policy->cur, we can
>> > simplify the solution I proposed and always do 2 * iowait_boost before
>>
>> No, doubling on the first boost was never discussed or intended in my
>> earlier patches. I thought even your patch never did, you were
>> dividing by 2, and then scaling it back up by 2 before consuming it to
>> preserve the initial boost.
>>
>> > getting current util/max in above if loop. i.e. we will start iowait
>> > boost with min * 2 instead of min and that should be fine.
>>
>> Hmm, but why start from double of min? Why not just min? It doesn't
>> make any difference to the intended behavior itself and is also
>> consistent with my proposal in RFC v4. Also I feel what you're
>> suggesting is more spike prone as well, the idea was to start from the
>> minimum and double it as we go, not to double the min the first go.
>> That was never intended.
>>
>> Also I would rather keep the "set and use and set and use" pattern to
>> keep the logic less confusing and clean IMO.
>> So we set initial boost in sugov_set_iowait_boost, and then in
>> sugov_iowait_boost we use it, and then set the boost for the next time
>> around at the end of sugov_iowait_boost (that is we double it). Next
>> time sugov_set_iowait_boost wouldn't touch the boost whether iowait
>> flag is set or not and we would continue into sugov_iowait_boost to
>> consume the boost. This would have a small delay in reducing the
>> boost, but that's Ok since its only one cycle of delay, and keeps the
>> code clean. I assume the last part is not an issue considering you're
>> proposing double of the initial boost anyway ;-)
>
> Okay, let me try to explain the problem first and then you can propose
> a solution if required.
>
> Expected Behavior:
>
> (Window refers to a time window of rate_limit_us here)
>
> A. The first window where IOWAIT flag is set, we set boost to min-freq
> and that shall be used for next freq update in
> sugov_iowait_boost(). Any more calls to sugov_set_iowait_boost()
> within this window shouldn't change the behavior.
>
> B. If the next window also has IOWAIT flag set, then
> sugov_iowait_boost() should use iowait*2 for freq update.
>
> C. If a window doesn't have IOWAIT flag set, then sugov_iowait_boost()
> should use iowait/2 in it.
>
>
> Do they look fine to you?
>
> Now coming to how will system behave with your patch:
>
> A. would be fine. We will follow things properly.
>
> But B. and C. aren't true anymore.
>
> This happened because after the first window we updated iowait_boost
> as 2*min unconditionally and the next window will *always* use that,
> even if the flag isn't set. And we may end up increasing the frequency
> unnecessarily, i.e. the spike where this discussion started.

Not really, to me B will still work because in the case the flag is
set, we are correctly double boosting in the next cycle.

Taking an example, with B = flag is set and D = flag is not set

F = Fmin (minimum)

iowait flag B B B D D D
resulting boost F 2*F 4*F 4*F 2*F F

What will not work is C but as I mentioned in my last email, that
would cause us to delay the iowait boost halving by at most 1 cycle,
is that really an issue considering we are starting from min compared
to max? Note that cases A. and B. are still working.

Considering the following cases:
(1) min freq is 800MHz, it takes upto 4 cycles to reach 4GHz where the
flag is set. At this point I think its likely we will run for many
more cycles which means keeping the boost active for 1 extra cycle
isn't that big a deal. Even if run for just 5 cycles with boost, that
means only the last cycle will suffer from C not decaying as soon as
possible. Comparing that to the case where in current code we
currently run at max from the first cycle, its not that bad.

(2) we have a transient type of thing, in this case we're probably not
reaching the full max immediately so even if we delay the decaying,
its still not as bad as what it is currently.

I think considering that the code is way cleaner than any other
approach - its a small price to pay. Also keeping in mind that this
patch is still an improvement over the current spike, even though as
you said its still a bit spikey, but its still better right?

Functionally the code is working and I think is also clean, but if you
feel that its still confusing, then I'm open to rewriting it.

>
> And so in my initial solution I reversed the order in
> sugov_iowait_boost().

Yes, but to fix A. you had to divide by 2 in sugov_set_iowait_boost,
and then multiply by 2 later in sugov_iowait_boost to keep the first
boost at min. That IMO was confusing so my modified patch did it
differently.

thanks,

-Joel