Re: [PATCH 0/6 v8] sched/fair: Add push task mechanism and handle more EAS cases

From: Qais Yousef

Date: Tue Mar 10 2026 - 00:16:57 EST

On 02/26/26 18:34, Pierre Gondois wrote:
>
> On 12/2/25 19:12, Vincent Guittot wrote:
> > This is a subset of [1] (sched/fair: Rework EAS to handle more cases)
> >
> > [1] https://lore.kernel.org/all/20250314163614.1356125-1-vincent.guittot@xxxxxxxxxx/
> >
> > The current Energy Aware Scheduler has some known limitations which have
> > become more and more visible with features like uclamp as an example. This
> > series tries to fix some of those issues:
> > - tasks stacked on the same CPU of a PD
> > - tasks stuck on the wrong CPU.
>
> Following some other comments I think, I'm not sure I understand the use
> case the patchset tries to solve.
> - If this is for UCLAMP_MAX tasks:
> As Christian said (somewhere) the utilization of a long running task doesn't
> represent anything, so using EAS to do task placement cannot give a good
> placement. The push mechanism effectively allows down-migrating UCLAMP_MAX
> tasks, but the repartition of these tasks is then subject to randomness.

Why randomness? We should distribute within the same perf domain, no?

>
> On a Radxa Orion:
> - 12 CPUs
> - CPU[1-4] are little CPUs with capa=290
> - using an artificial EM
>
> Running 8 CPU-bound tasks with UCLAMP_MAX=100, the task placement can be:
> - CPU1: 6 tasks
> - CPU2: 1 task
> - CPU3: 1 task
> - CPU4: idle
> The push mechanism triggers feec() and down-migrates tasks to little CPUs.
> However, it doesn't balance the ratio of (load / capacity) between CPUs as the
> load balancer could do. So the above placement is correct in that regard.

Hmm. Energy should tell us which perf domain is cheaper. But within the same
perf domain we pick the CPU with the most spare capacity.
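
To make that concrete, here is a minimal sketch of the selection I have in
mind. It is not the actual feec() code; the struct and function names are
made up for illustration, and it assumes spare capacity is simply capacity
minus estimated utilization, floored at 0.

```c
#include <stddef.h>

/* Hypothetical per-CPU snapshot for illustration; not a kernel structure. */
struct cpu_snapshot {
	unsigned long capacity;	/* CPU capacity, e.g. 290 for a little CPU */
	unsigned long util;	/* estimated utilization incl. the waking task */
};

/* Spare capacity, floored at 0 so it never underflows. */
static unsigned long spare_cap(const struct cpu_snapshot *c)
{
	return c->capacity > c->util ? c->capacity - c->util : 0;
}

/*
 * Return the index of the CPU with the most spare capacity within one
 * perf domain, or -1 if the domain is empty. On a tie the lowest index
 * wins, which is an assumption of this sketch.
 */
static int pick_max_spare_cap(const struct cpu_snapshot *pd, size_t n)
{
	int best = -1;
	unsigned long best_spare = 0;

	for (size_t i = 0; i < n; i++) {
		unsigned long s = spare_cap(&pd[i]);

		if (best < 0 || s > best_spare) {
			best = (int)i;
			best_spare = s;
		}
	}
	return best;
}
```

Note that in this sketch, if every CPU in the domain is saturated, all spare
capacities are 0 and the first CPU always wins the tie. That would be one way
to end up with tasks piling up on a single CPU.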

Do all the CPUs appear loaded with max_spare_cap = 0?

Worth noting: as part of enabling overloaded support, it is important to look
at nr_running, which I think is something we should do as we evolve this
handling. But for now, I think the max_spare_cap checks should distribute
within a perf domain. nr_running will handle this more gracefully, and it is
trivial to add to feec() later. Ideally, though, we want all wake-up code to
look at nr_running, and I think it is better to defer that until after the
initial merge.
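
As a rough illustration of what looking at nr_running could mean, here is a
sketch that uses it only as a tie-breaker when spare capacity is equal. This
is just one possible reading, not a proposal for the actual feec() change;
the struct and function names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical per-CPU snapshot for illustration; not a kernel structure. */
struct cpu_load_snapshot {
	unsigned long spare_cap;	/* capacity - util, already floored at 0 */
	unsigned int nr_running;	/* number of runnable tasks on the CPU */
};

/*
 * Pick the CPU with the most spare capacity; when spare capacity is equal,
 * prefer the CPU with fewer runnable tasks. Returns -1 for an empty domain.
 */
static int pick_cpu_nr_running(const struct cpu_load_snapshot *pd, size_t n)
{
	int best = -1;

	for (size_t i = 0; i < n; i++) {
		if (best < 0 ||
		    pd[i].spare_cap > pd[best].spare_cap ||
		    (pd[i].spare_cap == pd[best].spare_cap &&
		     pd[i].nr_running < pd[best].nr_running))
			best = (int)i;
	}
	return best;
}
```

With the placement quoted above (6, 1, 1 and 0 tasks on CPU1-4) and every CPU
reporting a spare capacity of 0, this sketch returns index 3, i.e. the idle
CPU4, rather than defaulting to the first CPU.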