Re: [PATCH 0/6 v8] sched/fair: Add push task mechanism and handle more EAS cases

From: Dietmar Eggemann
Date: Wed Dec 10 2025 - 08:32:34 EST


- hongyan.xia2@xxxxxxx
- luis.machado@xxxxxxx

On 02.12.25 19:12, Vincent Guittot wrote:
> This is a subset of [1] (sched/fair: Rework EAS to handle more cases)
>
> [1] https://lore.kernel.org/all/20250314163614.1356125-1-vincent.guittot@xxxxxxxxxx/
>
> The current Energy Aware Scheduler has some known limitations which have
> become more and more visible with features like uclamp. This
> series tries to fix some of those issues:
> - tasks stacked on the same CPU of a PD
> - tasks stuck on the wrong CPU.
>
> Patch 1 fixes the case where a CPU is wrongly classified as overloaded
> even though it is capped to a lower compute capacity. This wrong
> classification can prevent the periodic load balancer from selecting a
> group_misfit_task CPU because group_overloaded has higher priority.
>
> Patch 2 removes the need to test uclamp_min in cpu_overutilized() to
> trigger the active migration of a task to another CPU.
>
> Patch 3 prepares select_task_rq_fair() to be called without the TTWU,
> Fork or Exec flags when we just want to look for a possibly better CPU.
>
> Patch 4 adds the push callback mechanism to the fair scheduler but
> doesn't enable it.
>
> Patch 5 enables has_idle_core for !SMT systems to track whether there
> may be an idle CPU in the LLC.
>
> Patch 6 adds some conditions to enable pushing runnable tasks for EAS:
> - when a task is stuck on a CPU and the system is not overutilized.
> - if there is a possible idle CPU when the system is overutilized.
>
> More test results will come later as I wanted to send the patchset
> before LPC.
>
> I have kept the tbench figures as I added them in v7, but the results
> are the same with the corrected patch 6.
>
> Tbench on dragonboard rb5
> schedutil and EAS enabled
>
> # process     tip               +patchset
>  1            29.3(+/-0.3%)     29.2(+/-0.2%)    +0%
>  2            61.1(+/-1.8%)     61.7(+/-3.2%)    +1%
>  4           260.0(+/-1.7%)    258.8(+/-2.8%)    -1%
>  8          1361.2(+/-3.1%)   1377.1(+/-1.9%)    +1%
> 16           981.5(+/-0.6%)    958.0(+/-1.7%)    -2%
>
> Hackbench didn't show any difference

I guess the overall idea here is:

-->

(1) Push runnable tasks

[pick_next|put_prev]_task_fair() -> fair_add_pushable_task() ->
fair_push_task() (*)

__set_next_task_fair() -> fair_queue_pushable_tasks() ->
queue_balance_callback(..., push_fair_tasks)

push_fair_task() -> strf(), move_queued_task() (or similar)

(2) Push single running task

tick() -> check_pushable_task() -> fair_push_task() (*), strf(),
active_balance

<--

strf() ... select_task_rq_fair(..., 0)

(1) & (2) are invoked when the policy in fair_push_task() (two branches
depending on the OverUtilized (OU) scenario) says the task should be moved

fair_push_task() (*)

sched_energy_push_task() - non-OU

sched_idle_push_task() - OU


Pretty complex to reason about where this could be beneficial. I'm
thinking about the interaction of (1) and (2) with wakeup & MisFit (MF)
handling in non-OU and with load balance in OU.

You mentioned that you will show more test results beyond tbench soon.
Right now I don't know how to interpret the tbench results above.

IMHO, a set of rt-app files (customisable for a specific asymmetric CPU
capacity system, potentially with uclamp max settings) with scenarios
that provoke the new functionality would help with understanding and
evaluating this.