Re: [RFC PATCH] sched/feec: Simplify the traversal of pd'cpus
From: Dietmar Eggemann
Date: Mon Aug 18 2025 - 11:28:05 EST
On 18.08.25 12:05, Xuewen Yan wrote:
> On Fri, Aug 15, 2025 at 9:01 PM Dietmar Eggemann
> <dietmar.eggemann@xxxxxxx> wrote:
>>
>> On 14.08.25 10:52, Xuewen Yan wrote:
>>> Hi Dietmar,
>>>
>>> On Thu, Aug 14, 2025 at 4:46 PM Dietmar Eggemann
>>> <dietmar.eggemann@xxxxxxx> wrote:
>>>>
>>>> On 12.08.25 10:33, Xuewen Yan wrote:
>>
>> [...]
>>
>>>> Can you not mask cpus already early in the pd loop (1) and then profit
>>>> from (2) in these rare cases?
>>>
>>> I do not think the cpus_ptr mask should be applied before the pd_cap calc,
>>> because the following scenario should be considered:
>>> the task's cpus_ptr cpus: 0,1,2,3
>>> pd's cpus: 0,1,2,3,4,5,6
>>> the pd's cap = cpu_cap * 6;
>>> if we cpumask_and(pd's cpus, p->cpus_ptr),
>>> the cpumask_weight = 4,
>>> the pd's cap = cpu_cap *4.
>>
>> Yes, you're right! Missed this one.
>>
>>>> IIRC, the sd only plays a role here in
>>>> exclusive cpusets scenarios which I don't think anybody deploys with EAS?
>>>
>>> I am also wondering if the check for SD's CPUs could be removed...
>>
>> Still not 100% sure here. I would have to play with cpusets and EAS a
>> little bit more. Are you thinking that in those cases p->cpus_ptr
>> already covers the cpuset restriction so that the sd mask isn't necessary?
>
> I am not familiar with cpuset, so I can't guarantee this. Similarly, I
> also need to learn more about cpuset and cpu topology before I can
> answer this question.
Looks like we also need the sd cpumask here.
Consider this tri-gear system:
# cat /sys/devices/system/cpu/cpu*/cpu_capacity
160
160
160
160
498
498
1024
1024
and 2 exclusive cpusets cs1={0-1,4,6} and cs2={2-3,5,7}, so EAS is
possible in all 3 root_domains (/, /cs1, /cs2):
...
[ 74.691104] CPU1 attaching sched-domain(s):
[ 74.691180] domain-0: span=0-1 level=MC
[ 74.691244] groups: 1:{ span=1 cap=159 }, 0:{ span=0 cap=155 }
[ 74.693453] domain-1: span=0-1,4,6 level=PKG
[ 74.693534] groups: 0:{ span=0-1 cap=314 }, 4:{ span=4 cap=496 }, 6:{ span=6 cap=986 }
...
[ 74.697890] root domain span: 0-1,4,6
[ 74.697994] root_domain 2-3,5,7: pd6:{ cpus=6-7 nr_pstate=4 } pd4:{ cpus=4-5 nr_pstate=4 } pd0:{ cpus=0-3 nr_pstate=4 }
[ 74.698922] root_domain 0-1,4,6: pd6:{ cpus=6-7 nr_pstate=4 } pd4:{ cpus=4-5 nr_pstate=4 } pd0:{ cpus=0-3 nr_pstate=4 }
sd = rcu_dereference(*this_cpu_ptr(&sd_asym_cpucapacity));
For tasks running in '/', p->cpus_ptr still covers all CPUs (0-7), so the
sd span is the only mask that restricts the candidate CPUs to the current
root domain.
...
[001] 5290.935663: select_task_rq_fair: kworker/u33:3 382 prev_cpu=0
[001] 5290.935696: select_task_rq_fair: kworker/u33:3 382 prev_cpu=0 pd=6-7 online=0-7 sd=0-1,4,6 cpus_ptr=0-7
[001] 5290.935753: select_task_rq_fair: kworker/u33:3 382 prev_cpu=0 pd=4-5 online=0-7 sd=0-1,4,6 cpus_ptr=0-7
[001] 5290.935779: select_task_rq_fair: kworker/u33:3 382 prev_cpu=0 pd=0-3 online=0-7 sd=0-1,4,6 cpus_ptr=0-7
...
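
To make the two points from this thread concrete, here is a minimal
userspace sketch of the per-pd scan. It is not the kernel code: mask_t,
struct pd_scan, scan_pd() and NR_CPUS_SKETCH are hypothetical names, plain
bitmasks stand in for struct cpumask, and every CPU of a pd is assumed to
have the same capacity. It only shows that pd_cap is summed over all online
pd CPUs, while the placement candidates are additionally limited by the sd
span and by p->cpus_ptr.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct cpumask: bit n set means CPU n is in the mask. */
typedef uint32_t mask_t;

/* Assumption for this sketch: an 8-CPU system, as in the example above. */
#define NR_CPUS_SKETCH 8

struct pd_scan {
	unsigned long pd_cap;	/* capacity summed over all online CPUs of the pd */
	mask_t candidates;	/* CPUs the task may actually be placed on */
};

/*
 * Walk one performance domain. Every online pd CPU contributes to pd_cap,
 * but only CPUs that are also in the sd span and in the task's affinity
 * mask become placement candidates.
 */
static struct pd_scan scan_pd(mask_t pd, mask_t online, mask_t sd,
			      mask_t cpus_ptr, unsigned long cpu_cap)
{
	struct pd_scan res = { 0, 0 };
	mask_t cpus = pd & online;
	unsigned int cpu;

	assert(cpu_cap > 0);

	for (cpu = 0; cpu < NR_CPUS_SKETCH; cpu++) {
		mask_t bit = (mask_t)1 << cpu;

		if (!(cpus & bit))
			continue;

		/* Counted before any affinity filtering. */
		res.pd_cap += cpu_cap;

		if (!(sd & bit))
			continue;
		if (!(cpus_ptr & bit))
			continue;

		res.candidates |= bit;
	}

	return res;
}
```

With the masks from the trace (online=0-7 is 0xff, sd=0-1,4,6 is 0x53,
cpus_ptr=0-7 is 0xff), scan_pd(0x0f, 0xff, 0x53, 0xff, 160) keeps only CPUs
0-1 of pd 0-3 as candidates, whereas passing 0xff as the sd mask would also
admit CPUs 2-3, which belong to the other root domain. As a separate,
made-up illustration of the capacity point: a task pinned to CPUs 0-1
(cpus_ptr 0x03) on pd 0-3 still gets pd_cap = 4 * cpu_cap, which would not
be the case if the affinity mask were applied before the capacity sum.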