Re: [PATCH RFC 2/5] sched/fair: Skip frequency update if CPU about to idle

From: Joel Fernandes
Date: Wed Nov 08 2017 - 00:11:23 EST


Hi Vincent,

On Mon, Nov 6, 2017 at 12:00 AM, Vincent Guittot
<vincent.guittot@xxxxxxxxxx> wrote:
> Hi Joel,
>
> On 4 November 2017 at 06:44, Joel Fernandes <joelaf@xxxxxxxxxx> wrote:
>> On Wed, Nov 1, 2017 at 12:35 PM, Steve Muckle <smuckle@xxxxxxxxxx> wrote:
>>> On 10/30/2017 12:02 PM, Joel Fernandes wrote:
>>>>>
>>>>> Also, this more looks like a policy decision. Will it be better to
>>>>> put that directly into schedutil? Like this:
>>>>>
>>>>> if (cpu_idle())
>>>>> "Don't change the freq";
>>>>>
>>>>> Will something like that work?
>>>>
>>>>
>>>> I thought about this and I think it wont work very well. In the
>>>> dequeue path we're still running the task being dequeued so the CPU is
>>>> not yet idle. What is needed here IMO is a notion that the CPU is
>>>> possibly about to idle and we can get predict that from the dequeue
>>>> path of the CFS class.
>>>>
>>>> Also just looking at whether the CPU is currently idle or not in the
>>>> governor doesn't help to differentiate between say the dequeue path /
>>>> tick path. Both of which can occur when the CPU is not idle.
>>>>
>>>> Any thoughts about this?
>>>
>>>
>>> Also if it really is the case that this bit of policy is universally
>>> desirable, I'd think it is better to do this in the scheduler and avoid a
>>> needless trip through a fn pointer out to schedutil for performance reasons.
>>
>> I agree.
>>
>> Peter, what do you think? Are you Ok with the approach of this patch
>> (preventing of the cpufreq update call during dequeue)?
>
> I'm not really convinced that we should do change OPP at dequeue.

You mean "should not do", right?

> Although i agree that it makes perfect sense to prevent increasing OPP
> just before going idle for mp3 playback, i'm not sure that this is
> always the case especially for performance oriented use case because
> we delay the OPP increase and as a result the responsiveness of the
> CPU

Actually I think another way to look at it is this is no worse
performance-wise than if the task was running all the time (didn't
sleep) - the next granularity for increasing the frequency would then
be the next Tick.. So, I am not sure practically it makes any
difference to the performance of the task. Even if sleep was short, we
would update frequency on next enqueue/tick so I think we're still
fine from a time-granularity standpoint.

> In fact this decision really depends about how long we are going to
> sleep. If the cpu will wakes up in few ms, it's worth already
> increasing the frequency when utilization is above the threshold
> because it will not decreases enough to go back to lower OPP. At the
> opposite, if the cpu will not wake up shortly skipping OPP change can
> make sense.
>
> Regarding the reduction of the number of OPP changes, will the
> util_est feature provide the same amount of reduction by directly
> providing the estimated max utilization ?

Yes, I think util_est can help reduce the oscillations which cause
this issue but other than the fact that util_est is not currently
mainlined, I think util_est will still have the same issue if the
util_est's estimation itself oscillates so I think util_est is a
mitigation than a solution. Do you agree?

> Just to say that IMHO skipping or not OPP change at dequeue is a
> policy decision and not a generic one

Yes I agree, but also this means we have to expose scheduler internals
to the governor to detect this case. Are you suggesting if this was to
be implemented - that we pass a flag to governor and make it to do the
decision there based on policy?
As I was discussing in earlier thread with Viresh, simply checking if
CPU is idle in the governor isn't good enough and since this issue is
about the scheduler making cpufreq update call when it didn't need to,
avoiding such a call would make sense to me.

thanks,

- Joel