Re: [PATCH 8/8] sched: prefer cpufreq_scale_freq_capacity

From: Peter Zijlstra
Date: Wed Mar 16 2016 - 03:48:00 EST


On Tue, Mar 15, 2016 at 03:27:21PM -0700, Michael Turquette wrote:

> That solution scales for the case where architectures have different
> methods. It doesn't scale for the case where cpufreq drivers or platform
> code within the same arch have competing implementations.

Sure it does; no matter what interface we use on x86 to set the DVFS
hints (ACPI, intel_pstate, whatever), APERF/MPERF is the only way of
telling WTH the actual frequency was.
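
For illustration, a minimal userspace sketch of how the delivered
frequency falls out of two APERF/MPERF samples. The struct, the function
name and the base_khz parameter are hypothetical; real code would read
the IA32_APERF and IA32_MPERF MSRs and would need to know the frequency
at which MPERF ticks.

```c
#include <stdint.h>

/* Hypothetical snapshot of the two counters (not a kernel structure). */
struct perf_sample {
	uint64_t aperf;	/* counts at the actual (delivered) frequency */
	uint64_t mperf;	/* counts at a fixed reference frequency */
};

/*
 * Average delivered frequency in kHz between two samples.
 * base_khz is assumed to be the frequency at which MPERF counts.
 * Returns 0 if MPERF did not advance (nothing to measure).
 * Note: base_khz * d_aperf can overflow for very long intervals;
 * this sketch assumes short sampling windows.
 */
static uint64_t delivered_khz(struct perf_sample prev,
			      struct perf_sample now,
			      uint64_t base_khz)
{
	uint64_t d_aperf = now.aperf - prev.aperf;
	uint64_t d_mperf = now.mperf - prev.mperf;

	if (d_mperf == 0)
		return 0;

	return base_khz * d_aperf / d_mperf;
}
```

The result depends only on what the hardware actually did in the
interval, not on which driver requested which P-state.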

> I'm happy with it as a stop-gap, because it will initially work for
> arm{64} and x86, but we'll still need run-time selection of
> arch_scale_freq_capacity some day. Once we have that, I think that we
> should favor a run-time provided implementation over the arch-provided
> one.

Also, I'm thinking we don't need any of this. Your
cpufreq_scale_freq_capacity() is completely and utterly pointless. Since
its implementation simply returns whatever frequency we selected, it's
identical to not using frequency invariant load metrics and having
cpufreq use the !inv formula.

See:

lkml.kernel.org/r/20160309163930.GP6356@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Now, something else (power aware scheduling etc.) might need the freq
invariant stuff, but cpufreq (which we're concerned with here) does not,
unless arch_scale_freq_capacity() does something other than simply
return the value we've set earlier.
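
To make the equivalence concrete, here is a simplified userspace model.
It assumes frequency selection proportional to util/max, leaves out any
headroom constant, and all names and the 1024 scale are illustrative,
not the kernel's actual code.

```c
#include <stdint.h>

#define SCALE 1024ULL	/* fixed-point unit, assumed for this sketch */

/* Scale factor that merely echoes the frequency we requested earlier. */
static uint64_t echo_scale(uint64_t cur_khz, uint64_t max_khz)
{
	return cur_khz * SCALE / max_khz;
}

/*
 * Invariant path: utilization is first scaled by cur/max, then the
 * next frequency is picked relative to max_khz.
 */
static uint64_t next_khz_inv(uint64_t util_raw, uint64_t cur_khz,
			     uint64_t max_khz)
{
	uint64_t util_inv = util_raw * echo_scale(cur_khz, max_khz) / SCALE;

	return max_khz * util_inv / SCALE;
}

/*
 * Non-invariant (!inv) path: raw utilization, next frequency picked
 * relative to the current frequency.
 */
static uint64_t next_khz_raw(uint64_t util_raw, uint64_t cur_khz)
{
	return cur_khz * util_raw / SCALE;
}
```

With util_raw = 512, cur = 1000000 kHz and max = 2000000 kHz both paths
give 500000 kHz; the max_khz terms cancel (up to integer rounding). The
two only diverge when the scale factor reflects a measured frequency
that differs from the requested one.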