Re: [PATCH v4 3/3] cpufreq: Return failure if fast_switch is not set and fast_switch_possible is set

From: Rafael J. Wysocki
Date: Wed May 24 2023 - 13:45:49 EST


On Wed, May 17, 2023 at 6:30 PM Wyes Karny <wyes.karny@xxxxxxx> wrote:
>
> If fast_switch_possible flag is set by the scaling driver, the governor
> is free to select fast_switch function even if adjust_perf is set. Some
> scaling drivers which use adjust_perf don't set fast_switch thinking
> that the governor would never fall back to fast_switch. But the governor
> can fall back to fast_switch even in runtime if frequency invariance is
> disabled due to some reason. This could crash the kernel if the driver
> didn't set the fast_switch function pointer.
>
> Therefore, return failure in cpufreq_online function if fast_switch is
> not set and fast_switch_possible is set.
>
> Signed-off-by: Wyes Karny <wyes.karny@xxxxxxx>
> ---
> drivers/cpufreq/cpufreq.c | 5 +++++
> include/linux/cpufreq.h | 4 +++-
> 2 files changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 6b52ebe5a890..7835ba4fa34c 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -1376,6 +1376,11 @@ static int cpufreq_online(unsigned int cpu)
> goto out_free_policy;
> }
>
> + if (policy->fast_switch_possible && !cpufreq_driver->fast_switch) {
> + pr_err("fast_switch_possible is enabled but fast_switch callback is not set\n");
> + ret = -EINVAL;
> + goto out_destroy_policy;
> + }

Driver registration could be made to fail if the driver provides
->adjust_perf without ->fast_switch. Then the check above would no
longer be necessary.
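
To illustrate the registration-time check, here is a rough userspace
sketch. It is not the actual cpufreq code: the struct, the helper, and
the stub callbacks are hypothetical names made up for this example, and
the ->fast_switch signature is simplified to take a CPU number.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for struct cpufreq_driver. Only the two callbacks
 * discussed in this thread are modeled, and ->fast_switch is simplified.
 */
struct model_cpufreq_driver {
	unsigned int (*fast_switch)(unsigned int cpu, unsigned int target_freq);
	void (*adjust_perf)(unsigned int cpu, unsigned long min_perf,
			    unsigned long target_perf, unsigned long capacity);
};

/*
 * Hypothetical validation helper: refuse a driver that provides
 * ->adjust_perf but no ->fast_switch to fall back to.
 */
static int model_validate_driver(const struct model_cpufreq_driver *drv)
{
	if (!drv)
		return -EINVAL;

	if (drv->adjust_perf && !drv->fast_switch)
		return -EINVAL;

	return 0;
}

/* Stub callbacks, present only so the helper has something to inspect. */
static unsigned int model_fast_switch(unsigned int cpu, unsigned int target_freq)
{
	(void)cpu;
	return target_freq;
}

static void model_adjust_perf(unsigned int cpu, unsigned long min_perf,
			      unsigned long target_perf, unsigned long capacity)
{
	(void)cpu;
	(void)min_perf;
	(void)target_perf;
	(void)capacity;
}
```

With a check of this kind done once at registration, a driver with the
bad callback combination never gets as far as cpufreq_online().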

> /*
> * The initialization has succeeded and the policy is online.
> * If there is a problem with its frequency table, take it
> diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
> index 26e2eb399484..8cdf77bb3bc1 100644
> --- a/include/linux/cpufreq.h
> +++ b/include/linux/cpufreq.h
> @@ -340,7 +340,9 @@ struct cpufreq_driver {
> /*
> * ->fast_switch() replacement for drivers that use an internal
> * representation of performance levels and can pass hints other than
> - * the target performance level to the hardware.
> + * the target performance level to the hardware. If driver is setting this,
> + * then it needs to set fast_switch also. Because in certain scenario scale
> + * invariance could be disabled and governor can switch back to fast_switch.

I would say something like "This can only be set if ->fast_switch is
set too, because in those cases (under specific conditions) scale
invariance can be disabled, which causes the schedutil governor to
fall back to the latter."
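
For illustration, the fallback can be modeled roughly as below. This is
a simplified userspace model of the behavior described in the
changelog, not schedutil's code; every name in it is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Which driver callback the (modeled) governor ends up using. */
enum model_path {
	MODEL_PATH_ADJUST_PERF,
	MODEL_PATH_FAST_SWITCH,
};

/* Hypothetical summary of which callbacks a driver provides. */
struct model_driver_caps {
	bool has_adjust_perf;
	bool has_fast_switch;
};

/*
 * Pick the update path. ->adjust_perf is only used while scale
 * invariance is available; otherwise the governor falls back to
 * ->fast_switch. Returns 0 and stores the path on success, or -1 if the
 * fallback would have to call a callback the driver did not set.
 */
static int model_pick_path(const struct model_driver_caps *caps,
			   bool scale_invariant, enum model_path *path)
{
	if (caps->has_adjust_perf && scale_invariant) {
		*path = MODEL_PATH_ADJUST_PERF;
		return 0;
	}

	/* Fallback: this is where a missing ->fast_switch would be hit. */
	if (!caps->has_fast_switch)
		return -1;

	*path = MODEL_PATH_FAST_SWITCH;
	return 0;
}
```

The -1 case stands for the problem the changelog describes: the driver
sets ->adjust_perf only, scale invariance gets disabled, and the
fallback has no ->fast_switch to call.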

> */
> void (*adjust_perf)(unsigned int cpu,
> unsigned long min_perf,
> --
> 2.34.1
>