Re: [PATCH 4/5] cpufreq: schedutil: map raw required frequency to CPU-supported frequency

From: Rafael J. Wysocki
Date: Thu May 19 2016 - 17:07:47 EST


On Thu, May 19, 2016 at 9:35 PM, Steve Muckle <steve.muckle@xxxxxxxxxx> wrote:
> On Thu, May 19, 2016 at 01:37:40AM +0200, Rafael J. Wysocki wrote:
>> On Mon, May 9, 2016 at 11:20 PM, Steve Muckle <steve.muckle@xxxxxxxxxx> wrote:
>> > The mechanisms for remote CPU updates and slow-path frequency
>> > transitions are relatively expensive - the former is an IPI while the
>> > latter requires waking up a thread to do work. These activities should
>> > be avoided if they are not necessary. To that end, calculate the
>> > actual target-supported frequency required by the new utilization
>> > value in schedutil. If it is the same as the previously requested
>> > frequency then there is no need to continue with the update.
>>
>> Unless the max/min limits changed in the meantime, right?
>
> Right, I'll amend the commit text. The functionality is correct AFAICS.
>
>> >
>> > Signed-off-by: Steve Muckle <smuckle@xxxxxxxxxx>
>> > ---
>> > kernel/sched/cpufreq_schedutil.c | 14 +++++++++++++-
>> > 1 file changed, 13 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
>> > index 6cb2ecc204ec..e185075fcb5c 100644
>> > --- a/kernel/sched/cpufreq_schedutil.c
>> > +++ b/kernel/sched/cpufreq_schedutil.c
>> > @@ -153,14 +153,26 @@ static void sugov_update_commit(struct sugov_cpu *sg_cpu, int cpu, u64 time,
>> > * next_freq = C * curr_freq * util_raw / max
>> > *
>> > * Take C = 1.25 for the frequency tipping point at (util / max) = 0.8.
>> > + *
>> > + * The lowest target-supported frequency which is equal or greater than the raw
>> > + * next_freq (as calculated above) is returned, or the CPU's max_freq if such
>> > + * a target-supported frequency does not exist.
>> > */
>> > static unsigned int get_next_freq(struct cpufreq_policy *policy,
>> > unsigned long util, unsigned long max)
>> > {
>> > + struct cpufreq_frequency_table *entry;
>> > unsigned int freq = arch_scale_freq_invariant() ?
>> > policy->cpuinfo.max_freq : policy->cur;
>> > + unsigned int target_freq = UINT_MAX;
>> > +
>> > + freq = (freq + (freq >> 2)) * util / max;
>> > +
>> > + cpufreq_for_each_valid_entry(entry, policy->freq_table)
>> > + if (entry->frequency >= freq && entry->frequency < target_freq)
>> > + target_freq = entry->frequency;
>>
>> Please don't assume that every driver will have a frequency table.
>> That may not be the case in the future (and I'm not even sure about
>> the existing CPPC driver for that matter).
>
> For platforms without a frequency table I guess we can just continue
> with the current behavior, passing in the raw calculated frequency. I'll
> make this change.
>
> At some point I imagine those platforms will want to achieve similar
> behavior somehow, to avoid very small transitions that bring no real
> benefit. Maybe some sort of threshold % in schedutil down the road.

So honestly, I'd like to defer this particular optimization for the time being.
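
For reference, the table lookup with the no-table fallback discussed above
could be sketched roughly as follows. This is a plain user-space
illustration, not the kernel code: raw_next_freq() and resolve_freq() are
hypothetical names, and a bare array of frequencies stands in for
policy->freq_table. A NULL table models a driver without a frequency table,
in which case the raw value is passed through unchanged.

```c
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical helper mirroring the formula in the patch:
 * next_freq = 1.25 * freq * util / max.
 * A 64-bit intermediate avoids overflow where unsigned long is 32 bits.
 */
static unsigned int raw_next_freq(unsigned int freq, unsigned long util,
				  unsigned long max)
{
	unsigned long long scaled = (unsigned long long)freq + (freq >> 2);

	return (unsigned int)(scaled * util / max);
}

/*
 * Hypothetical helper: return the lowest table entry that is >= raw,
 * or max_freq if no such entry exists. If there is no table at all
 * (table == NULL or n == 0), fall back to the raw frequency.
 */
static unsigned int resolve_freq(const unsigned int *table, size_t n,
				 unsigned int raw, unsigned int max_freq)
{
	unsigned int target = UINT_MAX;
	size_t i;

	if (!table || n == 0)
		return raw;

	for (i = 0; i < n; i++)
		if (table[i] >= raw && table[i] < target)
			target = table[i];

	return target == UINT_MAX ? max_freq : target;
}
```

With something like this, the caller would compare the resolved value
against the previously requested frequency and skip the update only when
they match and the min/max limits have not changed in the meantime.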