Re: [PATCH] sched/fair: Revert boost in cpu_util()

From: Qais Yousef

Date: Mon May 18 2026 - 21:19:26 EST

On 05/18/26 11:37, hongyan.xia(夏弘彦) wrote:
> On 5/18/2026 6:04 PM, Christian Loehle wrote:
> >
> > On 5/18/26 03:40, hongyan.xia(夏弘彦) wrote:
> >> From: Hongyan Xia <hongyan.xia@xxxxxxxxxxxxx>
> >>
> >> We have seen a massive power consumption regression (20% SoC power
> >> increase in many apps) after updating our kernel. After bisection we
> >> pinpointed the regression to the cpu_util(boost) feature. After
> >> reverting the boost feature the massive energy regression is gone.
> >> Detailed trace analysis is below. The regression shows up across many
> >> apps, but YouTube is one of the worst offenders, as shown in the
> >> 1080p60fps video benchmark:
> >>
> >> Setup       FPS     SoC Power (mW)   diff
> >> w/ boost    59.94   913.6
> >> w/o boost   59.93   720.4            -21.15%
> >>
> >> Signed-off-by: Hongyan Xia <hongyan.xia@xxxxxxxxxxxxx>
> >>
> >> ---
> >> Analysis:
> >>
> >> We found several problems that result in the power spike:
> >>
> >> 1. Arithmetic should not happen between util_avg and runnable_avg:
> >>
> >> After util = max(util, runnable) which potentially picks runnable value
> >> in cpu_util(), we then add or subtract task util values from it. This
> >> produces a value that is half-runnable-half-util which is ill-defined.
> >> This alone should be a warning sign. This breaks EAS calculations in
> >> many cases, leading to sub-optimal task placements.

I don't think it does. The util signal itself has issues too :)
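
For anyone following along, the operation being questioned in point 1 can be
modelled roughly as below. This is a simplified standalone sketch, not the
kernel's cpu_util(); the function name model_cpu_util_without() and the
numbers are made up for illustration.

```c
#include <stdio.h>

#define SCHED_CAPACITY_SCALE 1024UL

static unsigned long max_ul(unsigned long a, unsigned long b)
{
	return a > b ? a : b;
}

/*
 * Hypothetical model: start from the rq util, optionally lift it to the
 * rq runnable value (the "boost"), then remove one task's util
 * contribution, as is done when evaluating the CPU without that task.
 */
static unsigned long model_cpu_util_without(unsigned long rq_util,
					    unsigned long rq_runnable,
					    unsigned long task_util,
					    int boost)
{
	unsigned long util = rq_util;

	if (boost)
		util = max_ul(util, rq_runnable);

	/*
	 * task_util is a util-domain value. If the boost picked runnable,
	 * the result below mixes the two signals.
	 */
	util = util > task_util ? util - task_util : 0;

	/* Clamp to the capacity scale (assumes a full-capacity CPU). */
	return util < SCHED_CAPACITY_SCALE ? util : SCHED_CAPACITY_SCALE;
}
```

With rq util 300, rq runnable 900 and a task util of 100, the model returns
200 without the boost and 800 with it.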

> >>
> >> 2. Using the absolute value of runnable_avg to drive frequency is
> >> too high to be reasonable:
> >>
> >> We use runnable in a _relative_ way to util to know whether there is
> >> contention in several places. However, the _absolute_ value should not
> >> be used like util. Runnable_avg tends to be significantly higher,
> >> making it much easier to saturate frequency.
> >>
> >> For example, if three tasks each with a util of 100 contend on the same
> >> rq, the rq util is 300 but runnable_avg shoots up to 900. 900 drives the
> >> CPU at the max frequency, and it's highly questionable whether this
> >> boost is the right decision.

I think this is the idea. These tasks are waiting behind other tasks.
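
To put rough numbers on the quoted example, here is a small standalone
sketch. It assumes the worst case where all tasks wake together, so each one
stays runnable until the whole batch has run, and it assumes a
schedutil-style 25% headroom when mapping utilization to frequency. The
helper names and the 2000000 kHz maximum frequency are hypothetical, not
taken from the kernel.

```c
#include <stdio.h>

#define CAPACITY 1024UL

/*
 * Worst-case model: nr_tasks identical tasks always contend with each
 * other. Each task is then runnable for as long as the whole batch runs,
 * so its runnable contribution equals the rq util, and the rq runnable
 * sum is nr_tasks times that.
 */
static unsigned long rq_runnable_worst_case(unsigned long task_util,
					    unsigned int nr_tasks)
{
	unsigned long rq_util = task_util * nr_tasks;

	return rq_util * nr_tasks;
}

/*
 * Map a utilization value to a frequency request with 25% headroom,
 * clamped to max_freq (both in kHz).
 */
static unsigned long freq_request(unsigned long util, unsigned long max_freq)
{
	unsigned long freq = (max_freq + (max_freq >> 2)) * util / CAPACITY;

	return freq < max_freq ? freq : max_freq;
}
```

Under these assumptions, three tasks of util 100 give an rq runnable of 900.
With a 2000000 kHz maximum, a util of 300 requests about 732 MHz while 900
already requests the maximum frequency.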

> >>
> >> 3. Runnable_avg may not even reflect true contention:
> >>
> >> When tasks are dependent, the bottleneck is often the data flow between
> >> tasks, not the contention seen by runnable_avg. Boosting frequency with
> >> runnable in such scenarios wastes power without performance benefits.

I believe contention describes several tasks fighting for CPU time while
only a single task can run and the others wait. But I think I know what you
mean; I think this is the same thing I was highlighting in [1].
We don't care if some tasks end up waiting longer.

> >>
> >> We found that 1 causes a minor power regression, but 2 and 3 regress
> >> power significantly. We have seen multiple applications using the
> >> producer-consumer model with many worker threads suffer. When there is
> >> IPC between producer and consumer, boosting frequency blindly does not
> >> help performance at all if the consumer is limited by how much data
> >> flows through. YouTube suffers from 1, 2 and 3 at the same time, leading
> >> to a total SoC power regression of 20%, as shown in the results above.
> >
> > We did discuss removing runnable boost internally as well, but I’d love to see
> > more data too.
> > The original issue it was trying to solve was avoiding jank frames during load
> > spikes, which YouTube does not really exercise. Some gaming workload data would
> > therefore be a useful addition here.
>
> Although I would be glad to provide more data (after more benchmarks and
> pending our internal approval), I wonder, what level of performance gain
> do we expect from this feature to justify the big energy regression?
>
> > Runnable boost was considered as an alternative to approaches like reducing the
> > PELT half-life and similar changes. Qais’ current ideas also try to tackle this
> > problem, of course, so +CC.

A lot of the current behavior is actually good for power by accident. And this
runnable approach helps performance as a workaround to these issues. We need to
defer some decisions to userspace and just give them a better way to decide
their trade-offs. One person's regression is another person's gain.

> >
> > If you have run many workloads, do you also have data on where this feature actually
> > helped, especially in reducing jank frames?
>
> We ran our Day of Use (DoU, including Facebook, Youtube and other
> popular apps) test model and we did see a 6.6% increase in jank frames
> after the revert. Dropped frames went up from 106 to 113 in a total of
> 70210 frames. However, in our test model there is no way an increase of
> 7 frames within 70210 justifies the energy regression between 10% and
> 20% in a lot of apps, hence for us the trade-off decision is very clear
> here.
>
> Another question from me is, if this feature has potentially buggy
> corners or mathematical unsoundness (mostly the half-util-half-runnable
> value inside cpu_util()), should we rely on its performance gain?
>
> >
> > Some discussion from back then:
> > https://lore.kernel.org/lkml/20230406155030.1989554-1-dietmar.eggemann@xxxxxxx/
> > https://lore.kernel.org/lkml/20220829055450.1703092-1-dietmar.eggemann@xxxxxxx/

I remember I had concerns about this approach back then [1]. I kept quiet
after it got merged and won't complain if it is removed now.

[1] https://lore.kernel.org/lkml/20230504152328.twh3rqgq2o2gvd4u@airbuntu/