On Tue, Oct 29, 2024 at 10:43 AM Lukasz Luba <lukasz.luba@xxxxxxx> wrote:
On some devices there are HW dependencies that force devices to share
frequency and voltage. This affects Energy Aware Scheduler (EAS) decisions
where CPUs share a voltage & frequency domain with other CPUs or devices,
e.g.:
- Mid CPUs + Big CPU
- Little CPU + L3 cache in DSU
- some other device + Little CPUs
Detailed explanation of one example:
When the L3 cache frequency is increased, the affected Little CPUs might
run at higher voltage and frequency. That higher voltage causes higher CPU
power and thus more energy is used for running the tasks. This is
important for background running tasks, which try to run on energy
efficient CPUs.
Therefore, add performance state limits which are applied for the device
(in this case CPU). This is important on SoCs with HW dependencies
mentioned above so that the Energy Aware Scheduler (EAS) does not use
performance states outside the valid min-max range for energy calculation.
Signed-off-by: Lukasz Luba <lukasz.luba@xxxxxxx>
---
include/linux/energy_model.h | 24 ++++++++++++++---
kernel/power/energy_model.c | 52 ++++++++++++++++++++++++++++++++++++
2 files changed, 72 insertions(+), 4 deletions(-)
diff --git a/include/linux/energy_model.h b/include/linux/energy_model.h
index 1ff52020cf757..e83bf230e18d1 100644
--- a/include/linux/energy_model.h
+++ b/include/linux/energy_model.h
@@ -55,6 +55,8 @@ struct em_perf_table {
* struct em_perf_domain - Performance domain
* @em_table: Pointer to the runtime modifiable em_perf_table
* @nr_perf_states: Number of performance states
+ * @min_ps: Minimum allowed Performance State index
+ * @max_ps: Maximum allowed Performance State index

Any problem with renaming these to min_perf_state and max_perf_state
respectively?

That would improve the code clarity quite a bit IMV.

static inline int
em_pd_get_efficient_state(struct em_perf_state *table, int nr_perf_states,
- unsigned long max_util, unsigned long pd_flags)
+ unsigned long max_util, unsigned long pd_flags,
+ int min_ps, int max_ps)
{
struct em_perf_state *ps;
int i;
- for (i = 0; i < nr_perf_states; i++) {
+ for (i = min_ps; i <= max_ps; i++) {
ps = &table[i];
if (ps->performance >= max_util) {
if (pd_flags & EM_PERF_DOMAIN_SKIP_INEFFICIENCIES &&
@@ -204,7 +213,7 @@ em_pd_get_efficient_state(struct em_perf_state *table, int nr_perf_states,
}
}
- return nr_perf_states - 1;
+ return max_ps;
}
/**
@@ -254,7 +263,8 @@ static inline unsigned long em_cpu_energy(struct em_perf_domain *pd,
*/
em_table = rcu_dereference(pd->em_table);
i = em_pd_get_efficient_state(em_table->state, pd->nr_perf_states,
- max_util, pd->flags);
+ max_util, pd->flags, pd->min_ps,
+ pd->max_ps);

Couldn't em_pd_get_efficient_state() just take pd as an argument and
dereference it by itself?

The code would be much easier to follow then.
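
For illustration, a minimal sketch of what that could look like, assuming the
fields are renamed to min_perf_state and max_perf_state as suggested earlier.
The structs below are simplified stand-ins rather than the real kernel
definitions, and the flag bit values are placeholders. The table is still
passed separately here because the caller obtains it via rcu_dereference();
that split is an assumption of this sketch, not a requirement.

```c
#include <stddef.h>

/* Placeholder bit values for this sketch; not copied from the kernel headers. */
#define EM_PERF_STATE_INEFFICIENT		(1UL << 0)
#define EM_PERF_DOMAIN_SKIP_INEFFICIENCIES	(1UL << 1)

/* Simplified stand-in: only the fields this sketch needs. */
struct em_perf_state {
	unsigned long performance;
	unsigned long flags;
};

/* Simplified stand-in, using the proposed (hypothetical) field names. */
struct em_perf_domain {
	int nr_perf_states;
	int min_perf_state;
	int max_perf_state;
	unsigned long flags;
};

/*
 * Return the index of the first allowed performance state whose
 * performance covers max_util, skipping inefficient states when the
 * domain asks for that. Falls back to the highest allowed state.
 */
static inline int
em_pd_get_efficient_state(struct em_perf_state *table,
			  struct em_perf_domain *pd, unsigned long max_util)
{
	struct em_perf_state *ps;
	int i;

	for (i = pd->min_perf_state; i <= pd->max_perf_state; i++) {
		ps = &table[i];
		if (ps->performance >= max_util) {
			if ((pd->flags & EM_PERF_DOMAIN_SKIP_INEFFICIENCIES) &&
			    (ps->flags & EM_PERF_STATE_INEFFICIENT))
				continue;
			return i;
		}
	}

	return pd->max_perf_state;
}
```

With a signature along these lines, the call site in em_cpu_energy() would
shrink to em_pd_get_efficient_state(em_table->state, pd, max_util), and the
limits would no longer need to be threaded through as separate arguments.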