Re: [PATCH v2 0/2] sched/fair: Limit access to overutilized

From: Shrikanth Hegde
Date: Wed Feb 28 2024 - 23:51:24 EST




On 2/29/24 5:38 AM, Qais Yousef wrote:
> On 02/28/24 12:46, Shrikanth Hegde wrote:
[...]
>> Overutilized was added for EAS(Energy aware scheduler) to choose either
>> EAS aware load balancing or regular load balance. As checked, on x86 and
>
> It actually toggles load balance on/off (off if !overutilized).
>
> misfit load balance used to be controlled by this but this was decoupled since
> commit e5ed0550c04c ("sched/fair: unlink misfit task from cpu overutilized")
>

Ok.

>> powerpc both overload and overutilized share the same cacheline in rd.
>> Updating overutilized is not required for non-EAS platforms.
>
> Is the fact that these two share the cacheline part of the problem? From patch
> 1 it seems the fact that overutilized is updated often on different cpus is the
> problem? Did you try to move overutilized to different places to see if this
> alternatively helps?
>
> The patches look fine to me. I am just trying to verify that indeed the access
> to overutilized is the problem, not something else on the same cacheline
> being accidentally slowed down, which means the problem can resurface in the
> future.
>

We tried explicitly cacheline-aligning overload. With that, newidle_balance goes
away from the perf profile, but enqueue_task_fair still remains. That is because
there is load-store tearing happening on the overutilized field alone, due to
different CPUs accessing and updating it at the same time.
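
To illustrate the layout experiment, here is a minimal userspace sketch. It is
not the kernel's struct root_domain: the struct names, the line_of() helper and
the 64-byte line size are assumptions made for the example (the real line size
is per-architecture), and offsets are taken relative to a struct that is
assumed to start on a cacheline boundary.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed cacheline size for this sketch; the real value is per-arch. */
#define SKETCH_CACHELINE 64

/* Hypothetical stand-in: both flags packed next to each other. */
struct rd_shared {
	int overload;
	int overutilized;
};

/* Hypothetical stand-in: each flag forced onto its own cacheline. */
struct rd_split {
	_Alignas(SKETCH_CACHELINE) int overload;
	_Alignas(SKETCH_CACHELINE) int overutilized;
};

/* Index of the cacheline an offset falls on, relative to the struct start. */
static size_t line_of(size_t offset)
{
	return offset / SKETCH_CACHELINE;
}

/* Compile-time check that the split layout separates the two fields. */
static_assert(offsetof(struct rd_split, overutilized) -
	      offsetof(struct rd_split, overload) >= SKETCH_CACHELINE,
	      "fields must not share a cacheline");
```

Splitting the fields this way only stops a write to one flag from disturbing
readers of the other. It does nothing for CPUs contending on the same flag,
which matches what we still see in enqueue_task_fair.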

We have also verified that the access to rq->rd->overutilized in the
enqueue_task_fair path is the reason it shows up in the perf profile.
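
The general idea of limiting that access can be sketched in userspace C as
below. This is only an illustration and not the patch: root_domain_sketch,
eas_enabled and set_overutilized() are hypothetical names, and skipping the
store when the value is unchanged is an extra assumption added for the example.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the field in question; not the kernel struct. */
struct root_domain_sketch {
	atomic_int overutilized;
};

/* Hypothetical stand-in for an "is EAS in use on this system" check. */
static bool eas_enabled;

/*
 * Touch the shared field only when EAS is in use, and store only when the
 * value would actually change. Returns true if a store was issued.
 */
static bool set_overutilized(struct root_domain_sketch *rd, int status)
{
	if (!eas_enabled)
		return false;	/* non-EAS: never access the shared line */

	if (atomic_load_explicit(&rd->overutilized,
				 memory_order_relaxed) == status)
		return false;	/* unchanged: avoid dirtying the line */

	atomic_store_explicit(&rd->overutilized, status,
			      memory_order_relaxed);
	return true;
}
```

With the flag check first, a non-EAS system never reads or writes the shared
field from this path, so the cacheline is not pulled between CPUs at all.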

>>
[...]
>>
>> --
>> 2.39.3
>>