Re: [PATCH v2] sched/topology: remove sysctl_sched_energy_aware depending on the architecture

From: Chen Yu
Date: Thu Sep 07 2023 - 13:58:53 EST


Hi Pierre,

On 2023-09-07 at 12:21:27 +0200, Pierre Gondois wrote:
> Hello Chen,
>
> On 9/1/23 11:49, Chen Yu wrote:
> > Hi Shrikanth,
> >
> > On 2023-09-01 at 12:22:49 +0530, Shrikanth Hegde wrote:
> > > Currently sysctl_sched_energy_aware doesn't alter the said behaviour on
> > > some of the architectures. IIUC its meant to either force rebuild the
> > > perf domains or cleanup the perf domains by echoing 1 or 0 respectively.
> > >
> > > perf domains are not built when there is SMT, or when there is no
> > > Asymmetric CPU topologies or when there is no frequency invariance.
> > > Since such cases EAS is not set and perf domains are not built. By
> > > changing the values of sysctl_sched_energy_aware, its not possible to
> > > force build the perf domains. Hence remove this sysctl on such platforms
> > > that dont support it. Some of the settings can be changed later
> > > such as smt_active by offlining the CPU's, In those cases if
> > > build_perf_domains returns true, re-enable the sysctl.
> > >
> > > Anytime, when sysctl_sched_energy_aware is changed sched_energy_update
> > > is set when building the perf domains. Making use of that to find out if
> > > the change is happening by sysctl or dynamic system change.
> > >
> > > Taking different cases:
> > > Case1. system while booting has EAS capability, sysctl will be set 1. Hence
> > > perf domains will be built if needed. On changing the sysctl to 0, since
> > > sched_energy_update is true, perf domains would be freed and sysctl will
> > > not be removed. later sysctl is changed to 1, enabling the perf domains
> > > rebuild again. Since sysctl is already there, it will skip register.
> > >
> > > Case2. System while booting doesn't have EAS Capability. Later after system
> > > change it becomes capable of EAS. sched_energy_update is false. Though
> > > sysctl is 0, will go ahead and try to enable eas. This is the current
> > > behaviour. if has_eas is true, then sysctl will be registered. After
> > > that any sysctl change is same as Case1.
> > >
> >
> > I think this change makes sense. Just one question for case 2,
> > sched_energy_update is not strictly tied with sysctl change, right?
> > sched_energy_update is true in rebuild_sched_domains_energy().
> > rebuild_sched_domains_energy() will not only be invoked by sysctl
> > path via sched_energy_aware_handler(), but also by other path, such
> > as update_scale_freq_invariant(). If the system boots with EAS capability
> > disabled, then it becomes EAS capable due to the frequency invariant
> > readiness(cpufreq policy change?), then
> > cpufreq_notifier(CPUFREQ_CREATE_POLICY) -> init_amu_fie_callback() ->
> > amu_fie_setup() -> opology_set_scale_freq_source() ->
> > update_scale_freq_invariant(true) -> rebuild_sched_domains_energy()
> > Since sched_energy_update is true, the rebuild of perf domain will be skipped(but
> > actually we want to create it) Please correct me if I miss something.
> >
>
> I thought 'sched_energy_update' was here to force rebuilding the
> perf_domains instead. If sched_energy_update=1, then it prevents from finding
> a pre-existing perf_domain and skipping the perf_domain rebuild, unless I also
> missed something:
>
>
> #if defined(CONFIG_ENERGY_MODEL) && defined(CONFIG_CPU_FREQ_GOV_SCHEDUTIL)
> /* Build perf. domains: */
> for (i = 0; i < ndoms_new; i++) {
> for (j = 0; j < n && !sched_energy_update; j++) {
> if (cpumask_equal(doms_new[i], doms_cur[j]) &&
> cpu_rq(cpumask_first(doms_cur[j]))->rd->pd) {
> has_eas = true;
> goto match3;
> }
> }
> /* No match - add perf. domains for a new rd */
> has_eas |= build_perf_domains(doms_new[i]);
> match3:
> ;
> }
> sched_energy_set(has_eas);
> #endif

Yes, it enters build_perf_domains(), in which we have this:

- if (!sysctl_sched_energy_aware)
+ if (!sysctl_sched_energy_aware && sched_energy_update)
goto free;

and domain rebuild will be skipped.

thanks,
Chenyu