Re: [PATCH RFC 2/5] sched/fair: Skip frequency update if CPU about to idle
From: Joel Fernandes
Date: Mon Oct 30 2017 - 15:03:08 EST
Hi Viresh,
On Mon, Oct 30, 2017 at 5:07 AM, Viresh Kumar <viresh.kumar@xxxxxxxxxx> wrote:
> On 28-10-17, 02:59, Joel Fernandes wrote:
>> Updating CPU frequency on last dequeue of a CPU is useless. Because the
>> utilization since CPU came out of idle can increase till the last dequeue, this
>> means we are requesting for a higher frequency before entering idle which is
>> not very meaningful or useful. It causes unwanted wakeups of the schedutil
>> governor kthread in slow-switch systems resulting in large number of wake ups
>> that could have been avoided. In an Android application playing music where the
>> music app's thread wakes up and sleeps periodically on an Android device, its
>> seen that the frequency increases slightly on the dequeue and is reduced when
>> the task wakes up again. This oscillation continues between 300Mhz and 350Mhz,
>> and while the task is running, its at 300MHz the whole time. This is pointless.
>> Adding to that, these are unnecessary wake ups. Infact most of the time when
>> the sugov thread wakes up, all the CPUs are idle - so it can hurt power by
>> disturbing the cluster when it is idling.
>>
>> This patch prevents a frequency update on the last dequeue. With this the
>> number of schedutil governor thread wake ups are reduces more than 2 times
>> (1389 -> 527).
>>
>> Cc: Rafael J. Wysocki <rjw@xxxxxxxxxxxxx>
>> Cc: Viresh Kumar <viresh.kumar@xxxxxxxxxx>
>> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
>> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
>> Signed-off-by: Joel Fernandes <joelaf@xxxxxxxxxx>
>> ---
>> kernel/sched/fair.c | 25 ++++++++++++++++++++++---
>> kernel/sched/sched.h | 1 +
>> 2 files changed, 23 insertions(+), 3 deletions(-)
>
> So you are doing this only for CFS, isn't that required for RT/DL as
> well?
Yes I agree we should. I remember this patch from Patrick doing
something similar for RT:
https://patchwork.kernel.org/patch/9825461/
That patch prevents cpufreq update from dequeue_task_rt path since we
no longer would trigger it from update_curr_rt all the time. Is this
an acceptable solution for RT?
>
> Also, this more looks like a policy decision. Will it be better to
> put that directly into schedutil? Like this:
>
> if (cpu_idle())
> "Don't change the freq";
>
> Will something like that work?
I thought about this and I think it wont work very well. In the
dequeue path we're still running the task being dequeued so the CPU is
not yet idle. What is needed here IMO is a notion that the CPU is
possibly about to idle and we can get predict that from the dequeue
path of the CFS class.
Also just looking at whether the CPU is currently idle or not in the
governor doesn't help to differentiate between say the dequeue path /
tick path. Both of which can occur when the CPU is not idle.
Any thoughts about this?
thanks,
- Joel