Re: [Update][PATCH v7 7/7] cpufreq: schedutil: New governor based on scheduler utilization data

From: Peter Zijlstra
Date: Thu Mar 31 2016 - 08:48:55 EST

On Wed, Mar 30, 2016 at 04:00:24AM +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> Add a new cpufreq scaling governor, called "schedutil", that uses
> scheduler-provided CPU utilization information as input for making
> its decisions.
> Doing that is possible after commit 34e2c555f3e1 (cpufreq: Add
> mechanism for registering utilization update callbacks) that
> introduced cpufreq_update_util() called by the scheduler on
> utilization changes (from CFS) and RT/DL task status updates.
> In particular, CPU frequency scaling decisions may be based on
> the utilization data passed to cpufreq_update_util() by CFS.
> The new governor is relatively simple.
> The frequency selection formula used by it depends on whether or not
> the utilization is frequency-invariant. In the frequency-invariant
> case the new CPU frequency is given by
> 	next_freq = 1.25 * max_freq * util / max
> where util and max are the last two arguments of cpufreq_update_util().
> In turn, if util is not frequency-invariant, the maximum frequency in
> the above formula is replaced with the current frequency of the CPU:
> 	next_freq = 1.25 * curr_freq * util / max
> The coefficient 1.25 corresponds to the frequency tipping point at
> (util / max) = 0.8.
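
For illustration, the two formulas quoted above can be sketched in plain
integer C. This is only a sketch under stated assumptions, not the code
from the patch: the names sketch_next_freq() and sketch_pick_freq() and
the freq_invariant flag are hypothetical, and 1.25 * freq is written as
freq + freq/4 to avoid floating point.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not taken from the patch):
 *   next_freq = 1.25 * freq * util / max
 * 1.25 * freq is computed as freq + (freq >> 2) to stay in integer math.
 */
static uint64_t sketch_next_freq(uint64_t freq, uint64_t util, uint64_t max)
{
	return (freq + (freq >> 2)) * util / max;
}

/*
 * Hypothetical wrapper: use max_freq as the base when util is
 * frequency-invariant, otherwise use the current frequency.
 */
static uint64_t sketch_pick_freq(int freq_invariant, uint64_t max_freq,
				 uint64_t curr_freq, uint64_t util,
				 uint64_t max)
{
	uint64_t base = freq_invariant ? max_freq : curr_freq;

	return sketch_next_freq(base, util, max);
}
```

With util / max = 0.8 the result equals the base frequency, which is the
tipping point mentioned above: below that ratio the sketch asks for less
than the base frequency, above it for more.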
> All of the computations are carried out in the utilization update
> handlers provided by the new governor. One of those handlers is
> used for cpufreq policies shared between multiple CPUs and the other
> one is for policies with one CPU only (and therefore it doesn't need
> to use any extra synchronization means).
> The governor supports fast frequency switching if that is supported
> by the cpufreq driver in use and possible for the given policy.
> In the fast switching case, all operations of the governor take
> place in its utilization update handlers. If fast switching cannot
> be used, the frequency switch operations are carried out with the
> help of a work item which only calls __cpufreq_driver_target()
> (under a mutex) to trigger a frequency update (to a value already
> computed beforehand in one of the utilization update handlers).
> Currently, the governor treats all of the RT and DL tasks as
> "unknown utilization" and sets the frequency to the allowed
> maximum when updated from the RT or DL sched classes. That
> heavy-handed approach should be replaced with something more
> subtle and specifically targeted at RT and DL tasks.
> The governor shares some tunables management code with the
> "ondemand" and "conservative" governors and uses some common
> definitions from cpufreq_governor.h, but apart from that it
> is stand-alone.
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> ---
> drivers/cpufreq/Kconfig | 29 ++
> kernel/sched/Makefile | 1
> kernel/sched/cpufreq_schedutil.c | 528 +++++++++++++++++++++++++++++++++++++++
> kernel/sched/sched.h | 8
> 4 files changed, 566 insertions(+)

I think this is a good first step and we can definitely work from here;
afaict there are no (big) disagreements on the general approach, so

Acked-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>