Re: [PATCH v9] sched/fair: Filter false overloaded_group case for EAS

From: Qais Yousef

Date: Mon Feb 16 2026 - 20:04:36 EST

On 02/12/26 09:55, Christian Loehle wrote:
> On 2/11/26 01:48, Qais Yousef wrote:
> > On 02/06/26 10:54, Vincent Guittot wrote:
> >> With EAS, a group should be set overloaded if at least 1 CPU in the group
> >> is overutilized but it can happen that a CPU is fully utilized by tasks
> >> because of clamping the compute capacity of the CPU. In such case, the CPU
> >> is not overutilized and as a result should not be set overloaded as well.
> >>
> >> group_overloaded being a higher priority than group_misfit, such group can
> > >> be selected as the busiest group instead of a group with a misfit task
> > >> and prevents load_balance from selecting the CPU with the misfit task to pull
> >> the latter on a fitting CPU.
> >>
> >> Signed-off-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
> >> Tested-by: Pierre Gondois <pierre.gondois@xxxxxxx>
> >> ---
> >>
> >> This patch was part of a larger patchset [1] but makes sense on its own and has
> >> not changed since v2
> >>
> >> [1] https://lore.kernel.org/all/20251202181242.1536213-1-vincent.guittot@xxxxxxxxxx/
> >
> > I don't mind this. But I think with the original series misfit will be handled
> > better with push lb, and if it is made to handle overloaded case (which my
> > initial testing shows it is easily doable and I can't see clear bad impact
> > yet), I think we can retire overutilized altogether.
> >
>
> The EAS wakeup path (and therefore the push lb for that matter) is costly and workloads
> are sensitive to it; it's trivial to see with hackbench. Overutilized prevents that.

What workloads? I have been testing this and all I am seeing are great results
so far.

Hackbench is a super synthetic test that doesn't represent any real workload.
It purely measures context switch overhead. I think I said this before, but
I'll repeat it again. For most modern systems and workloads we really need to
spend more time making sure we make the correct task placement decision, as the
cost of a fast wrong decision is worse than that of a slow correct one. And this
is not specific to mobile systems; servers and others care too. Those who really
don't want any additional overhead can just disable the static key.

FWIW I tried schbench, which is more realistic since it does something that
represents a web server and measures throughput and latencies. I got 10%
better throughput, 27% better P99 and 49% better max latencies. And yes, OU was
completely disabled when I ran this test.

But disclaimer again: I backported an earlier (modified) version of the patch
and am running on a non-mainline kernel with OOT changes applied that I think
help demonstrate the benefit even better.

Vincent, I am trying to stress the importance of the work and its great
potential. I am not expecting the initial merge to handle everything yet ;-)

> Arguments about PELT inaccuracies during periods of unmet compute demand (and therefore
> entirely bogus EAS computation results) aside, I don't see how a push lb could retire
> OU? If anything you're paying twice the price then for these scenarios?

I am not seeing any price to be paid. Geekbench scores are within run-to-run
variation.