Re: [PATCH RFC v4] cpufreq: schedutil: Make iowait boost more energy efficient

From: Viresh Kumar
Date: Wed Jul 12 2017 - 01:00:50 EST

On 11-07-17, 07:14, Joel Fernandes wrote:
> I think the whole point of IOWAIT boost was to solve the issue with a
> long sequence of repeated I/O requests as described in the commit
> message. So IIUC there isn't a usecase for that (increase freq. on
> first request).

Right. So we can take example that Peter gave earlier. Task runs .1 ms and waits
for IO for 1 ms (at max speed). But there is high possibility that the util
update handler gets called within that 1 ms (from non-enqueue paths) and because
you chose to reduce iowait boost from sugov_set_iowait_boost() in your commit,
we can easily end up ignoring iowait boosting.

> Also its just for the first couple of requests in my
> testing and doesn't hurt the performance at all for the intended
> usecase while still not causing transient spikes.

We can have bad enough timing where the util handler gets called right in that 1
ms of IOWAIT period and we will never boost.

> Another approach than setting min in sugov_set_iowait_boost, is, since
> we have already retrieved the current util, we can check if flags ==
> SCHED_CPUFREQ_IOWAIT, then set initial the iowait_boost such that
> (iowait_boost / iowait_boost_max) is aleast equal to (util / max) or
> iowait_boost_min, which ever is lower.

So my concerns weren't only about the initial min value, but also that you
reduce the freq from sugov_set_iowait_boost(). We can discuss what the ideal
value to start with can be.

> This still will not increase
> frequency on the first request, but will ensure the next one will
> benefit.

If there is no non-enqueue path request lands.

> Yes, I've seen that happen in my testing (consecutive iowait).

The CFS scheduler can send a util update request every 1 ms for util updates and
I am not sure why isn't that happening in your case.

How much is the time between two consecutive IOWAIT requests in your case ?
Maybe it is too less (Ofcourse it isn't in your control :). But if we take
Peter's example, then it will surely have a non-enqueue path request between two
IOWAIT requests.

> I
> haven't seen the other case where you have IOWAIT followed by
> non-IOWAIT for a repeated set of IOWAIT requests. Would you more
> comfortable if we moved sugov_set_iowait_boost() after the
> sugov_should_update_freq() ?

That may make us ignore all IOWAIT requests that happen between rate_limit_us
time. And that would be bad IMHO.

> That way if there are consecutive
> requests in the same path, then it most likely rate-limiting will
> prevent such updates. I will also try to collect some stats as you
> suggested to see if how often if at all this can happen.

Completely untested, but what about something like this ? This should get rid of
the spikes you were getting.

diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index 076a2e31951c..3459f327c94e 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -53,6 +53,7 @@ struct sugov_cpu {
struct update_util_data update_util;
struct sugov_policy *sg_policy;

+ bool iowait_boost_pending;
unsigned long iowait_boost;
unsigned long iowait_boost_max;
u64 last_update;
@@ -169,7 +170,12 @@ static void sugov_set_iowait_boost(struct sugov_cpu *sg_cpu, u64 time,
unsigned int flags)
- sg_cpu->iowait_boost = sg_cpu->iowait_boost_max;
+ sg_cpu->iowait_boost_pending = true;
+ if (!sg_cpu->iowait_boost) {
+ sg_cpu->iowait_boost = sg_cpu->sg_policy->policy->cur;
+ sg_cpu->iowait_boost >>= 1;
+ }
} else if (sg_cpu->iowait_boost) {
s64 delta_ns = time - sg_cpu->last_update;

@@ -182,17 +188,26 @@ static void sugov_set_iowait_boost(struct sugov_cpu *sg_cpu, u64 time,
static void sugov_iowait_boost(struct sugov_cpu *sg_cpu, unsigned long *util,
unsigned long *max)
- unsigned long boost_util = sg_cpu->iowait_boost;
- unsigned long boost_max = sg_cpu->iowait_boost_max;
+ unsigned long boost_util, boost_max;

- if (!boost_util)
+ if (!sg_cpu->iowait_boost)

+ if (sg_cpu->iowait_boost_pending) {
+ sg_cpu->iowait_boost_pending = false;
+ sg_cpu->iowait_boost = min(sg_cpu->iowait_boost << 1,
+ sg_cpu->iowait_boost_max);
+ } else {
+ sg_cpu->iowait_boost >>= 1;
+ }
+ boost_util = sg_cpu->iowait_boost;
+ boost_max = sg_cpu->iowait_boost_max;
if (*util * boost_max < *max * boost_util) {
*util = boost_util;
*max = boost_max;
- sg_cpu->iowait_boost >>= 1;