Re: [RFC][PATCH v021 4/9] sched/topology: Adjust cpufreq checks for EAS
From: Christian Loehle
Date: Wed Dec 11 2024 - 05:33:58 EST
On 11/29/24 16:00, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
>
> Make it possible to use EAS with cpufreq drivers that implement the
> :setpolicy() callback instead of using generic cpufreq governors.
>
> This is going to be necessary for using EAS with intel_pstate in its
> default configuration.
>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> ---
>
> This is the minimum of what's needed, but I'd really prefer to move
> the cpufreq vs EAS checks into cpufreq because messing around cpufreq
> internals in topology.c feels like a butcher shop kind of exercise.
Makes sense, something like cpufreq_eas_capable().
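A rough userspace sketch of what such a helper could look like, with mocked-up types so it compiles outside the kernel. The name cpufreq_policy_eas_capable(), the mock structs and ondemand_gov are made up for illustration; only the condition itself is lifted from the hunk further down.

```c
#include <stdbool.h>
#include <stddef.h>

/* Mock stand-ins for the kernel definitions; only what the check needs. */
#define CPUFREQ_POLICY_UNKNOWN		0
#define CPUFREQ_POLICY_POWERSAVE	1
#define CPUFREQ_POLICY_PERFORMANCE	2

struct cpufreq_governor {
	const char *name;
};

struct cpufreq_policy {
	struct cpufreq_governor *governor;	/* NULL for setpolicy drivers */
	unsigned int policy;			/* only meaningful for setpolicy drivers */
};

static struct cpufreq_governor schedutil_gov = { .name = "schedutil" };
static struct cpufreq_governor ondemand_gov = { .name = "ondemand" };

/*
 * Hypothetical helper: same condition as the patch, but living on the
 * cpufreq side so topology.c does not have to look at policy internals.
 */
static bool cpufreq_policy_eas_capable(const struct cpufreq_policy *policy)
{
	if (!policy)
		return false;

	/* Generic governor case: only schedutil qualifies. */
	if (policy->governor == &schedutil_gov)
		return true;

	/* setpolicy driver case: no governor, but a known policy is set. */
	return !policy->governor && policy->policy != CPUFREQ_POLICY_UNKNOWN;
}
```

sched_is_eas_possible() would then only need to call the helper for each CPU's policy and bail out on the first false.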
>
> Besides, as I said before, I remain unconvinced about the usefulness
> of these checks at all. Yes, one is supposed to get the best results
> from EAS when running schedutil, but what if they just want to try
> something else with EAS? What if they can get better results with
> that other thing, surprisingly enough?
How do you imagine this working, then?
I assume we don't make any 'resulting-OPP-guesses' like
sugov_effective_cpu_perf() for any of the setpolicy governors.
Nor for dbs and, I guess, userspace.
What about standard powersave and performance?
Do we just have a cpufreq callback to ask which OPP to use for
the energy calculation? Assume lowest/highest?
(I don't think there is hardware where lowest/highest makes a
difference, so maybe not bothering with the complexity could
be an option, too.)
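To make the "assume lowest/highest" option concrete, here is a minimal userspace sketch. Everything in it is hypothetical (struct perf_range, eas_assumed_perf(), and the idea that the policy maps directly to one end of the range); it is not an existing cpufreq interface, just the simplest form the guess could take.

```c
/* Mock stand-ins for the kernel policy constants. */
#define CPUFREQ_POLICY_UNKNOWN		0
#define CPUFREQ_POLICY_POWERSAVE	1
#define CPUFREQ_POLICY_PERFORMANCE	2

/* Hypothetical: the performance range a setpolicy driver exposes. */
struct perf_range {
	unsigned long min_perf;
	unsigned long max_perf;
};

/*
 * Hypothetical: pick the performance level the energy calculation
 * should assume. Returns 0 when no guess can be made.
 */
static unsigned long eas_assumed_perf(unsigned int policy,
				      const struct perf_range *range)
{
	switch (policy) {
	case CPUFREQ_POLICY_POWERSAVE:
		return range->min_perf;		/* assume lowest */
	case CPUFREQ_POLICY_PERFORMANCE:
		return range->max_perf;		/* assume highest */
	default:
		return 0;			/* unknown policy: no guess */
	}
}
```

Whether a static guess like this is good enough is exactly the open question, since a setpolicy driver is free to pick anything within the range.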
>
> ---
> kernel/sched/topology.c | 10 +++++++---
> 1 file changed, 7 insertions(+), 3 deletions(-)
>
> Index: linux-pm/kernel/sched/topology.c
> ===================================================================
> --- linux-pm.orig/kernel/sched/topology.c
> +++ linux-pm/kernel/sched/topology.c
> @@ -217,6 +217,7 @@ static bool sched_is_eas_possible(const
>  	bool any_asym_capacity = false;
>  	struct cpufreq_policy *policy;
>  	struct cpufreq_governor *gov;
> +	bool cpufreq_ok;
>  	int i;
>
>  	/* EAS is enabled for asymmetric CPU capacity topologies. */
> @@ -251,7 +252,7 @@ static bool sched_is_eas_possible(const
>  		return false;
>  	}
>
> -	/* Do not attempt EAS if schedutil is not being used. */
> +	/* Do not attempt EAS if cpufreq is not configured adequately */
>  	for_each_cpu(i, cpu_mask) {
>  		policy = cpufreq_cpu_get(i);
>  		if (!policy) {
> @@ -261,11 +262,14 @@ static bool sched_is_eas_possible(const
>  			}
>  			return false;
>  		}
> +		/* Require schedutil or a "setpolicy" driver */
>  		gov = policy->governor;
> +		cpufreq_ok = gov == &schedutil_gov ||
> +			(!gov && policy->policy != CPUFREQ_POLICY_UNKNOWN);
>  		cpufreq_cpu_put(policy);
> -		if (gov != &schedutil_gov) {
> +		if (!cpufreq_ok) {
>  			if (sched_debug()) {
> -				pr_info("rd %*pbl: Checking EAS, schedutil is mandatory\n",
> +				pr_info("rd %*pbl: Checking EAS, unsuitable cpufreq governor\n",
>  					cpumask_pr_args(cpu_mask));
>  			}
>  			return false;
The logic here looks fine to me FWIW.