Re: [PATCH] sched/fair: Revert boost in cpu_util()

From: Hongyan Xia

Date: Wed May 27 2026 - 21:42:00 EST


On 5/27/2026 1:16 AM, Dietmar Eggemann wrote:
> On 18.05.26 04:40, hongyan.xia(夏弘彦) wrote:
>> From: Hongyan Xia <hongyan.xia@xxxxxxxxxxxxx>
>>
>> We have seen a massive power consumption regression (20% SoC power
>> increase in many apps) after updating our kernel. After bisection we
>> pinpointed the regression to the cpu_util(boost) feature. After
>> reverting the boost feature the massive energy regression is gone.
>> Detailed trace analysis down below. The regression is found across quite
>> a few apps but YouTube is one of the worst offenders, shown in the
>> 1080p60fps video benchmark:
>>
>> Setup      FPS    SoC Power (mW)  diff
>> w/ boost   59.94  913.6
>> w/o boost  59.93  720.4           -21.15%
>>
>> Signed-off-by: Hongyan Xia <hongyan.xia@xxxxxxxxxxxxx>
>
> Looks like I missed running one of the low-power test cases back then.
>
> That said, boosting tasks under contention does not seem like a
> particularly good idea.

Actually it could be desirable to boost under contention, but using the
raw value of runnable_avg might need some re-thinking.
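
To make the concern concrete: runnable_avg also accounts for time spent
waiting on the runqueue, so under contention it can sit well above
util_avg, and max(util, runnable) then reports the CPU as much busier
than it actually is. A rough user-space sketch of what the removed branch
computes is below. This is not kernel code; the struct and helper names
are made up for illustration, and only the max() logic is taken from the
quoted hunk.

```c
/* Hypothetical stand-in for the two PELT signals that cpu_util() reads. */
struct cfs_avg_sketch {
	unsigned long util_avg;     /* time spent actually running */
	unsigned long runnable_avg; /* running plus waiting on the runqueue */
};

/* Illustrative helper; the kernel uses its own max() macro. */
static unsigned long max_ul(unsigned long a, unsigned long b)
{
	return a > b ? a : b;
}

/*
 * Mirrors the removed branch: with boost set, the reported utilization
 * becomes max(util_avg, runnable_avg); without it, plain util_avg.
 */
static unsigned long cpu_util_sketch(const struct cfs_avg_sketch *avg, int boost)
{
	unsigned long util = avg->util_avg;

	if (boost)
		util = max_ul(util, avg->runnable_avg);

	return util;
}
```

With made-up values util_avg = 200 and runnable_avg = 600, the sketch
returns 200 without boost and 600 with it, i.e. the raw runnable_avg
passes through unscaled whenever it is the larger of the two.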

> Jankbench, being primarily a UI rendering benchmark, was probably
> overemphasizing stress on the Android Graphics Pipeline (AGP), where the
> runnable_avg boosting happened to help. Support for Jankbench also
> appears to have stopped with Android 12; I am not sure why.
>
> At the same time, low-power benchmarks, and even games, do not seem to
> have equally strict requirements for consistently meeting UI rendering
> deadlines.
>
> It is also possible that the AGP itself has evolved since then. I
> vaguely remember a Google Bootcamp presentation discussing the injection
> of performance hints at the beginning of a frame-rendering cycle to
> mitigate early jank, but I can no longer find it. I also do not know how
> that approach relates to the removal of all vendor hooks.
>
> Given all of this, I would also lean toward removing the runnable_avg
> boosting functionality entirely, rather than keeping it behind a sched
> feature flag that defaults to false.
>
>> @@ -8229,16 +8222,10 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
>>   * Return: (Boosted) (estimated) utilization for the specified CPU.
>
> s/(Boosted) (e/(E/

Ack.

>>   */
>>  static unsigned long
>> -cpu_util(int cpu, struct task_struct *p, int dst_cpu, int boost)
>> +cpu_util(int cpu, struct task_struct *p, int dst_cpu)
>>  {
>>  	struct cfs_rq *cfs_rq = &cpu_rq(cpu)->cfs;
>>  	unsigned long util = READ_ONCE(cfs_rq->avg.util_avg);
>> -	unsigned long runnable;
>> -
>> -	if (boost) {
>> -		runnable = READ_ONCE(cfs_rq->avg.runnable_avg);
>> -		util = max(util, runnable);
>> -	}
>>
>>  	/*
>>  	 * If @dst_cpu is -1 or @p migrates from @cpu to @dst_cpu remove its