Re: [PATCH 5/7 v2] sched/fair: Add push task callback for EAS

From: Pierre Gondois
Date: Thu Jan 16 2025 - 12:34:54 EST


Hello Vincent,

On 12/17/24 17:07, Vincent Guittot wrote:
EAS is based on wakeup events to efficiently place tasks on the system, but
there are cases where a task will not have wakeup events anymore or at a
far too low pace. For such situation, we can take advantage of the task
being put back in the enqueued list to check if it should be migrated on
another CPU.

Wake up events remain the main way to migrate tasks but we now detect
situation where a task is stuck on a CPU by checking that its utilization
is larger than the max available compute capacity (max cpu capacity or
uclamp max setting)

It seems there are 2 distinct cases:
a- The task is alone on a rq
b- The task shares the rq and is enqueued/dequeued

Case a. doesn't seem to need any of the push functions, and case b. doesn't
seem to need any of the misfit functions. Maybe it's worth splitting the
patch in 2.


Signed-off-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
---
kernel/sched/fair.c | 206 +++++++++++++++++++++++++++++++++++++++++++
kernel/sched/sched.h | 2 +
2 files changed, 208 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index cd046e8216a9..2affc063da55 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7088,6 +7088,7 @@ enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)
hrtick_update(rq);
}
+static void dequeue_pushable_task(struct rq *rq, struct task_struct *p);
static void set_next_buddy(struct sched_entity *se);
/*
@@ -7118,6 +7119,9 @@ static int dequeue_entities(struct rq *rq, struct sched_entity *se, int flags)
h_nr_idle = task_has_idle_policy(p);
if (task_sleep || task_delayed || !se->sched_delayed)
h_nr_runnable = 1;
+
+ if (task_sleep || task_on_rq_migrating(p))
+ dequeue_pushable_task(rq, p);
} else {
cfs_rq = group_cfs_rq(se);
slice = cfs_rq_min_slice(cfs_rq);
@@ -8617,6 +8621,182 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
return target;
}
+static inline bool task_misfit_cpu(struct task_struct *p, int cpu)
+{
+ unsigned long max_capa = get_actual_cpu_capacity(cpu);
+ unsigned long util = task_util_est(p);
+
+ max_capa = min(max_capa, uclamp_eff_value(p, UCLAMP_MAX));
+ util = max(util, task_runnable(p));
+
+ /*
+ * Return true only if the task might not sleep/wakeup because of a low
+ * compute capacity. Tasks, which wake up regularly, will be handled by
+ * feec().
+ */

NIT:
On a little CPU with min_OPP=256 and max_OPP=512,
a task with util=100 and U_Max=10 will trigger this condition.
However:
- the task is already well placed from a power PoV
- the task has opportunities to sleep/wake up
Shouldn't we ideally take:

unsigned long max_capa;
max_capa = max(min_capa(cpu), uclamp_eff_value(p, UCLAMP_MAX));
max_capa = min(get_actual_cpu_capacity(cpu), max_capa);

with min_capa(cpu) returning 256 in this case, i.e. the CPU capacity at the
lowest OPP?
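
To make the numbers above concrete, here is a rough user-space sketch of the
two computations. It is not kernel code: struct cpu_model, struct task_model,
misfit_as_posted() and misfit_with_floor() are hypothetical names used only
for illustration, and min_capa is the assumed helper suggested above (the
capacity at the lowest OPP), which the patch does not provide.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel values discussed above. */
struct cpu_model {
	unsigned long min_capa;    /* assumed: capacity at the lowest OPP */
	unsigned long actual_capa; /* stand-in for get_actual_cpu_capacity(cpu) */
};

struct task_model {
	unsigned long util;       /* stand-in for max(task_util_est(p), task_runnable(p)) */
	unsigned long uclamp_max; /* stand-in for uclamp_eff_value(p, UCLAMP_MAX) */
};

static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

static unsigned long max_ul(unsigned long a, unsigned long b)
{
	return a > b ? a : b;
}

/* Condition as posted: the capacity limit is clamped by uclamp max only. */
static bool misfit_as_posted(const struct task_model *t, const struct cpu_model *c)
{
	unsigned long max_capa = min_ul(c->actual_capa, t->uclamp_max);

	return t->util > max_capa;
}

/* Suggested variant: never let the limit drop below the lowest-OPP capacity. */
static bool misfit_with_floor(const struct task_model *t, const struct cpu_model *c)
{
	unsigned long max_capa = max_ul(c->min_capa, t->uclamp_max);

	max_capa = min_ul(c->actual_capa, max_capa);

	return t->util > max_capa;
}

/* The example from this mail: little CPU 256..512, util=100, U_Max=10. */
static void check_nit_example(void)
{
	struct cpu_model little = { .min_capa = 256, .actual_capa = 512 };
	struct task_model p = { .util = 100, .uclamp_max = 10 };

	assert(misfit_as_posted(&p, &little));   /* 100 > min(512, 10) */
	assert(!misfit_with_floor(&p, &little)); /* 100 <= min(512, max(256, 10)) */
}
```

With the posted condition the limit becomes 10, so the task is flagged even
though the CPU cannot run below a capacity of 256. With the floor applied the
limit is 256 and the task is not flagged. A task whose utilization really
exceeds the CPU capacity (e.g. util=600 on the same CPU) is flagged by both
variants.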

+ return (util > max_capa);
+}
+

[...]