Re: [PATCH v3] sched/topology: remove sysctl_sched_energy_aware depending on the architecture

From: Shrikanth Hegde
Date: Fri Sep 15 2023 - 14:11:57 EST




On 9/15/23 5:30 PM, Valentin Schneider wrote:
> On 14/09/23 23:26, Shrikanth Hegde wrote:
>> On 9/14/23 9:51 PM, Valentin Schneider wrote:
>>> On 13/09/23 17:18, Shrikanth Hegde wrote:
>>>> sysctl_sched_energy_aware is available for the admin to disable/enable
>>>> energy aware scheduling(EAS). EAS is enabled only if few conditions are
>>>> met by the platform. They are, asymmetric CPU capacity, no SMT,
>>>> valid cpufreq policy, frequency invariant load tracking. It is possible
>>>> platform when booting may not have EAS capability, but can do that after.
>>>> For example, changing/registering the cpufreq policy.
>>>>
>>>> At present, though platform doesn't support EAS, this sysctl is still
>>>> present and it ends up calling rebuild of sched domain on write to 1 and
>>>> NOP when writing to 0. That is confusing and un-necessary.
>>>>
>>>
>>
>> Hi Valentin, Thanks for taking a look at this patch.
>>
>>> But why would you write to it in the first place? Or do you mean to use
>>> this as an indicator for userspace that EAS is supported?
>>>
>>
>> Since this sysctl is present and its value being 1, it gives the
>> impression to the user that EAS is supported when it is not.
>> So its an attempt to correct that part.
>>
>
> Ah, I see. Then how about just making the sysctl return 0 when EAS isn't
> supported? And on top of it, prevent all writes when EAS isn't supported
> (perf domains cannot be built, so there would be no point in forcing a
> rebuild that will do nothing).

Yes. That's another way. Thats what I had as possible approach in
https://lore.kernel.org/lkml/d2c945d6-c4f0-a096-0623-731b11484f51@xxxxxxxxxxxxxxxxxx/



>
> I can never remember how to properly use the sysctl API, so that's a very
> crude implementation, but something like so?
>
> ---
>
> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> index 05a5bc678c089..dadfc5afc4121 100644
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -230,9 +230,28 @@ static int sched_energy_aware_handler(struct ctl_table *table, int write,
> if (write && !capable(CAP_SYS_ADMIN))
> return -EPERM;
>
> + if (!sched_energy_enabled()) {

Use of sched_energy_enabled won't work as Pierre has indicated.

Instead this can be done by adding those checks in a helper function to
do similar checks as done build_perf_domains.

I can send v4 with this approach if it makes more sense. Please let me know.

> + if (write)
> + return -EOPNOTSUPP;
> + else {
> + size_t len;
> +
> + if (*ppos) {
> + *lenp = 0;
> + return 0;
> + }
> +
> + len = snprintf((char *)buffer, 3, "0\n");
> +
> + *lenp = len;
> + *ppos += len;
> + return 0;
> + }
> + }
> +
> ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
> if (!ret && write) {
> - state = static_branch_unlikely(&sched_energy_present);
> + state = sched_energy_enabled();
> if (state != sysctl_sched_energy_aware)
> rebuild_sched_domains_energy();
> }
>