Re: [PATCH RFC v4] cpufreq: schedutil: Make iowait boost more energy efficient

From: Joel Fernandes
Date: Tue Jul 11 2017 - 01:12:22 EST


Hi Juri,

On Mon, Jul 10, 2017 at 3:55 AM, Juri Lelli <juri.lelli@xxxxxxx> wrote:
> Hi Joel,
>
> On 09/07/17 10:08, Joel Fernandes wrote:
>> Currently the iowait_boost feature in schedutil makes the frequency go to max.
>> This feature was added to handle a case that Peter described, where the
>> throughput of operations involving continuous I/O requests [1] is reduced by
>> running at a lower frequency; the lower throughput itself keeps utilization
>> low, which in turn keeps the frequency low, so it gets "stuck".
>>
>> Instead of going to max, it's also possible to achieve the same effect by
>> ramping up to max if there are repeated in_iowait wakeups. This patch is an
>> attempt to do that. We start from a lower frequency (iowait_boost_min) and
>> double the boost on every consecutive iowait update until we reach the
>> maximum iowait boost frequency (iowait_boost_max).
>>
>> I ran a synthetic test on an x86 machine with intel_pstate in passive mode
>> using schedutil. The patch achieves the same desired effect as the existing
>> behavior. I also tested on an ARM64 platform and saw that transient iowait
>> requests there no longer cause frequency spikes.
>>
>> [1] https://patchwork.kernel.org/patch/9735885/
>>
>> Cc: Srinivas Pandruvada <srinivas.pandruvada@xxxxxxxxxxxxxxx>
>> Cc: Len Brown <lenb@xxxxxxxxxx>
>> Cc: Rafael J. Wysocki <rjw@xxxxxxxxxxxxx>
>> Cc: Viresh Kumar <viresh.kumar@xxxxxxxxxx>
>> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
>> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
>> Suggested-by: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
>> Signed-off-by: Joel Fernandes <joelaf@xxxxxxxxxx>
[..]
>>
>> diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
>> index 622eed1b7658..4d9e8b96bed1 100644
>> --- a/kernel/sched/cpufreq_schedutil.c
>> +++ b/kernel/sched/cpufreq_schedutil.c
>> @@ -53,7 +53,9 @@ struct sugov_cpu {
>>         struct update_util_data update_util;
>>         struct sugov_policy *sg_policy;
>>
>> +        bool prev_iowait_boost;
>>         unsigned long iowait_boost;
>> +        unsigned long iowait_boost_min;
>>         unsigned long iowait_boost_max;
>>         u64 last_update;
>>
>> @@ -168,22 +170,47 @@ static void sugov_get_util(unsigned long *util, unsigned long *max)
>>         *max = cfs_max;
>> }
>>
>> +static void sugov_decay_iowait_boost(struct sugov_cpu *sg_cpu)
>> +{
>> +        sg_cpu->iowait_boost >>= 1;
>> +
>> +        if (sg_cpu->iowait_boost < sg_cpu->iowait_boost_min)
>> +                sg_cpu->iowait_boost = 0;
>> +}
>> +
>> static void sugov_set_iowait_boost(struct sugov_cpu *sg_cpu, u64 time,
>>                                    unsigned int flags)
>> {
>>         if (flags & SCHED_CPUFREQ_IOWAIT) {
>> -                sg_cpu->iowait_boost = sg_cpu->iowait_boost_max;
>> +                /* Remember for next time that we did an iowait boost */
>> +                sg_cpu->prev_iowait_boost = true;
>> +                if (sg_cpu->iowait_boost) {
>> +                        sg_cpu->iowait_boost <<= 1;
>> +                        sg_cpu->iowait_boost = min(sg_cpu->iowait_boost,
>> +                                                   sg_cpu->iowait_boost_max);
>> +                } else {
>> +                        sg_cpu->iowait_boost = sg_cpu->iowait_boost_min;
>> +                }
>>         } else if (sg_cpu->iowait_boost) {
>>                 s64 delta_ns = time - sg_cpu->last_update;
>>
>>                 /* Clear iowait_boost if the CPU appears to have been idle. */
>>                 if (delta_ns > TICK_NSEC)
>>                         sg_cpu->iowait_boost = 0;
>
> In this case we might just also want to reset prev_iowait_boost
> unconditionally and return.

Ok, will do.
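
Something along these lines, perhaps (untested sketch, just to make sure I
read your suggestion right):

        } else if (sg_cpu->iowait_boost) {
                s64 delta_ns = time - sg_cpu->last_update;

                /* Clear iowait_boost if the CPU appears to have been idle. */
                if (delta_ns > TICK_NSEC) {
                        sg_cpu->iowait_boost = 0;
                        sg_cpu->prev_iowait_boost = false;
                        return;
                }

                /* Decay the boost consumed by the previous IOWAIT update. */
                if (sg_cpu->prev_iowait_boost) {
                        sugov_decay_iowait_boost(sg_cpu);
                        sg_cpu->prev_iowait_boost = false;
                }
        }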

>> +
>> +                /*
>> +                 * Since we don't decay iowait_boost when it's consumed during
>> +                 * the previous SCHED_CPUFREQ_IOWAIT update, decay it now.
>> +                 */
>
> Code below seems pretty self-explanatory to me. :)
>

Ok :)

>> +                if (sg_cpu->prev_iowait_boost) {
>> +                        sugov_decay_iowait_boost(sg_cpu);
>> +                        sg_cpu->prev_iowait_boost = false;
>> +                }
>>         }
>> }
>>
>> static void sugov_iowait_boost(struct sugov_cpu *sg_cpu, unsigned long *util,
>> -                               unsigned long *max)
>> +                               unsigned long *max, unsigned int flags)
>> {
>>         unsigned long boost_util = sg_cpu->iowait_boost;
>>         unsigned long boost_max = sg_cpu->iowait_boost_max;
>> @@ -195,7 +222,16 @@ static void sugov_iowait_boost(struct sugov_cpu *sg_cpu, unsigned long *util,
>>                 *util = boost_util;
>>                 *max = boost_max;
>>         }
>> -        sg_cpu->iowait_boost >>= 1;
>> +
>> +        /*
>> +         * In case an iowait boost was just requested on this CPU, don't
>> +         * reduce it right away; otherwise the boost would never get a
>> +         * chance to grow on subsequent in_iowait wakeups.
>> +         */
>> +        if (flags & SCHED_CPUFREQ_IOWAIT && this_cpu_ptr(&sugov_cpu) == sg_cpu)
>> +                return;
>> +
>> +        sugov_decay_iowait_boost(sg_cpu);
>
> Mmm, do we need to decay even when we are computing frequency requests
> for a domain?
>
> Considering it a per-CPU thing, isn't it enough that it gets bumped up or
> decayed only when a CPU does an update (by using the above from
> sugov_update_shared)?
>
> If we go this way I think we will only need to reset prev_iowait_boost
> if delta_ns > TICK_NSEC during the for_each_cpu() loop of
> sugov_next_freq_shared().
>

Actually, the "decay" was already being done before this patch; I am just
preserving the existing behavior here. Perhaps your proposal could be a
separate patch? Or did I miss something else subtle here?
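
For reference, here is a tiny userspace toy model of the intended ramp-up and
decay behavior (this is not kernel code; the min/max boost values are made up
purely for illustration):

#include <stdio.h>

/* Made-up stand-ins for the policy's min/max frequencies. */
#define BOOST_MIN  200000UL
#define BOOST_MAX 1600000UL

static unsigned long boost;

/* Consecutive iowait wakeup: start at the minimum, then double each time. */
static void iowait_wakeup(void)
{
        boost = boost ? boost << 1 : BOOST_MIN;
        if (boost > BOOST_MAX)
                boost = BOOST_MAX;
}

/* Other updates: halve the boost; once below the minimum, drop it to zero. */
static void decay(void)
{
        boost >>= 1;
        if (boost < BOOST_MIN)
                boost = 0;
}

int main(void)
{
        int i;

        for (i = 0; i < 5; i++) {
                iowait_wakeup();
                printf("iowait wakeup %d: boost = %lu\n", i + 1, boost);
        }
        for (i = 0; i < 5; i++) {
                decay();
                printf("decay %d:         boost = %lu\n", i + 1, boost);
        }
        return 0;
}

That ramps 200000 -> 400000 -> 800000 -> 1600000 (capped at the max), then
decays 800000 -> 400000 -> 200000 -> 0, which is the single-CPU behavior the
patch is aiming for.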

Thanks,

-Joel