On 19/10/2023 18:05, Mathieu Desnoyers wrote:
+static unsigned long scale_rt_capacity(int cpu);
+
+/*
+ * Returns true if adding the task utilization to the estimated
+ * utilization of the runnable tasks on @cpu does not exceed the
+ * capacity of @cpu.
+ *
+ * This considers only the utilization of _runnable_ tasks on the @cpu
+ * runqueue, excluding blocked and sleeping tasks. This is achieved by
+ * using the runqueue util_est.enqueued.
+ */
+static inline bool task_fits_remaining_cpu_capacity(unsigned long task_util,
+                                                     int cpu)
This is almost like the existing task_fits_cpu(p, cpu) (used in Capacity-Aware
Scheduling (CAS) for asymmetric CPU capacity systems), except that the latter
only uses `util = task_util_est(p)`, also deals with uclamp, and only tests
whether p itself could fit on the CPU.
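
For reference, task_fits_cpu() looks roughly like this (paraphrased from
memory of kernel/sched/fair.c, so the exact details may differ in the tree
this patch is against):

  static inline int task_fits_cpu(struct task_struct *p, int cpu)
  {
          unsigned long uclamp_min = uclamp_eff_value(p, UCLAMP_MIN);
          unsigned long uclamp_max = uclamp_eff_value(p, UCLAMP_MAX);
          unsigned long util = task_util_est(p);

          return (util_fits_cpu(util, uclamp_min, uclamp_max, cpu) > 0);
  }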
Or it is like find_energy_efficient_cpu() (feec(), used in
Energy-Aware Scheduling (EAS)), which uses cpu_util(cpu, p, cpu, 0) to get:

  max(util_avg(CPU + p), util_est(CPU + p))
feec()
    ...
    for (; pd; pd = pd->next)
        ...
        util = cpu_util(cpu, p, cpu, 0);
        ...
        fits = util_fits_cpu(util, util_min, util_max, cpu)
                                   ^^^^^^^^^^^^^^^^^^
                                   not used when uclamp is not active (1)
            ...
            capacity = capacity_of(cpu)
            fits = fits_capacity(util, capacity)
            if (!uclamp_is_used()) (1)
                return fits
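
(Where fits_capacity() is, IIRC, the usual ~20% headroom check:

  #define fits_capacity(cap, max)  ((cap) * 1280 < (max) * 1024)

so "fits" already leaves some margin below the full capacity.)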
So it would be good not to introduce a new function like
task_fits_remaining_cpu_capacity() in this area and to use the existing ones
instead.
+{
+        unsigned long total_util;
+
+        if (!sched_util_fits_capacity_active())
+                return false;
+        total_util = READ_ONCE(cpu_rq(cpu)->cfs.avg.util_est.enqueued) + task_util;
+        return fits_capacity(total_util, scale_rt_capacity(cpu));
Why not use:

  static unsigned long capacity_of(int cpu)
  {
          return cpu_rq(cpu)->cpu_capacity;
  }

which is maintained in update_cpu_capacity() as scale_rt_capacity(cpu)?
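
i.e. the check above could then just be something like (sketch only, reusing
your util_est.enqueued + task_util sum):

  total_util = READ_ONCE(cpu_rq(cpu)->cfs.avg.util_est.enqueued) + task_util;
  return fits_capacity(total_util, capacity_of(cpu));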
[...]
@@ -7173,7 +7200,8 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
         if (recent_used_cpu != prev &&
             recent_used_cpu != target &&
             cpus_share_cache(recent_used_cpu, target) &&
-            (available_idle_cpu(recent_used_cpu) || sched_idle_cpu(recent_used_cpu)) &&
+            (available_idle_cpu(recent_used_cpu) || sched_idle_cpu(recent_used_cpu) ||
+             task_fits_remaining_cpu_capacity(task_util, recent_used_cpu)) &&
             cpumask_test_cpu(recent_used_cpu, p->cpus_ptr) &&
             asym_fits_cpu(task_util, util_min, util_max, recent_used_cpu)) {
                 return recent_used_cpu;
diff --git a/kernel/sched/features.h b/kernel/sched/features.h
index ee7f23c76bd3..9a84a1401123 100644
--- a/kernel/sched/features.h
+++ b/kernel/sched/features.h
@@ -97,6 +97,12 @@ SCHED_FEAT(WA_BIAS, true)
SCHED_FEAT(UTIL_EST, true)
SCHED_FEAT(UTIL_EST_FASTUP, true)
IMHO, asymmetric CPU capacity systems would have to disable the sched
feature UTIL_FITS_CAPACITY. Otherwise CAS could deliver different results,
since task_fits_remaining_cpu_capacity() and asym_fits_cpu() work slightly
differently.
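
For comparison, asym_fits_cpu() is basically (again paraphrased from memory,
details may differ):

  static inline bool asym_fits_cpu(unsigned long util,
                                   unsigned long util_min,
                                   unsigned long util_max,
                                   int cpu)
  {
          if (sched_asym_cpucap_active())
                  return util_fits_cpu(util, util_min, util_max, cpu);

          return true;
  }

i.e. it is a no-op on symmetric systems and only looks at the task's own
utilization, whereas task_fits_remaining_cpu_capacity() adds the task on top
of the rq's util_est.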
[...]