Re: [PATCH v2 0/1] arm: topology: parse the topology from the dt

From: Dietmar Eggemann
Date: Mon Apr 19 2021 - 17:27:55 EST


On 19/04/2021 04:55, Ruifeng Zhang wrote:
> Dietmar Eggemann <dietmar.eggemann@xxxxxxx> wrote on Sat, Apr 17, 2021 at 1:00 AM:
>>
>> On 16/04/2021 13:04, Ruifeng Zhang wrote:
>>> Dietmar Eggemann <dietmar.eggemann@xxxxxxx> wrote on Fri, Apr 16, 2021 at 6:39 PM:
>>>>
>>>> On 16/04/2021 11:32, Valentin Schneider wrote:
>>>>> On 16/04/21 15:47, Ruifeng Zhang wrote:

[...]

>> I'm afraid that this is now a much weaker case to get this into
>> mainline.
>
> But it's still a problem, and the patch doesn't break the original logic
> (parse topology from MPIDR or parse capacity); it only adds support for
> parsing the topology from DT.
> I think it should still be merged into mainline. If it isn't, the
> DynamIQ SoC has some issues in sched and cpufreq.

IMHO, not necessarily. Your DynamIQ SoC is one cluster with 8 CPUs. It's
subdivided into 2 Frequency Domains (FDs).

CFS Energy-Aware-Scheduling (EAS, find_energy_efficient_cpu()) and
Capacity-Aware-Scheduling (CAS, select_idle_sibling() ->
select_idle_capacity()) work correctly even if you only have an MC
sched domain (sd).
No matter which sd (MC, DIE) sd_asym_cpucapacity points to, we always
iterate over all CPUs: per Performance Domain (i.e. FD) in EAS and
over sched_domain_span(sd) in CAS.
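
To illustrate the CAS part, here is a minimal userspace sketch (not kernel
code) of a select_idle_capacity()-style scan over a single MC-level span.
All toy_* names, the capacity values and the 6+2 split of the 8 CPUs into
two FDs are assumptions made up for the example; only the ~20% margin check
mirrors the idea behind the kernel's fits_capacity().

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NR_CPUS 8

/* Toy stand-in for a sched domain: only the CPU span (bitmask) matters. */
struct toy_sd {
	unsigned int span;
};

/* Assumed layout: CPUs 0-5 are little (FD0), CPUs 6-7 are big (FD1). */
static const unsigned long toy_capacity[TOY_NR_CPUS] = {
	512, 512, 512, 512, 512, 512, 1024, 1024
};

/* util fits if it stays below roughly 80% of the CPU capacity. */
static int toy_fits_capacity(unsigned long util, unsigned long cap)
{
	return util * 1280 < cap * 1024;
}

/*
 * Scan every idle CPU in the sd span. Return the first one whose
 * capacity fits util, else the idle CPU with the highest capacity,
 * else -1 if no CPU in the span is idle.
 */
static int toy_select_idle_capacity(const struct toy_sd *sd,
				    unsigned int idle_mask,
				    unsigned long util)
{
	unsigned long best_cap = 0;
	int best_cpu = -1;
	int cpu;

	assert(sd != NULL);

	for (cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		unsigned int bit = 1u << cpu;

		if (!(sd->span & bit) || !(idle_mask & bit))
			continue;

		if (toy_fits_capacity(util, toy_capacity[cpu]))
			return cpu;

		if (toy_capacity[cpu] > best_cap) {
			best_cap = toy_capacity[cpu];
			best_cpu = cpu;
		}
	}

	return best_cpu;
}
```

With a span of 0xff (one MC sd covering all 8 CPUs) and all CPUs idle, a
task with util 600 ends up on a big CPU in this model even though there is
no DIE level, because the scan covers the whole span.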

CFS load-balancing (in case your system is `over-utilized`) might work
slightly differently due to the missing DIE sd, but not necessarily worse.

Do you have benchmarks or testcases in mind which convince you that
Phantom Domains are something you need? BTW, they are called
Phantom since they let you use uarch and/or max CPU frequency domains to
fake real topology (like LLC) boundaries.

[...]

> Why do you keep the logic of topology_parse_cpu_capacity in the arm
> get_coretype_capacity function? The capacity-dmips-mhz will be parsed
> by drivers/base/arch_topology.c as follows:
> parse_dt_topology
>   parse_cluster
>     parse_core
>       get_cpu_for_node
>         topology_parse_cpu_capacity

I think we still need it for systems out there w/o cpu-map in dt, like
my arm32 TC2 with mainline vexpress-v2p-ca15_a7.dts.

topology_parse_cpu_capacity() is called twice for each CPU if I add the
cpu-map dt entry, though.
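
For reference, a cpu-map entry for a 2x A15 + 3x A7 system could look
roughly like the fragment below. This is only a sketch: the &cpu0..&cpu4
labels and the cluster/core numbering are assumptions, not copied from
vexpress-v2p-ca15_a7.dts.

```dts
/* Hypothetical cpu-map; &cpuN labels and cluster numbering are assumed. */
cpus {
	cpu-map {
		cluster0 {
			core0 { cpu = <&cpu0>; };
			core1 { cpu = <&cpu1>; };
		};
		cluster1 {
			core0 { cpu = <&cpu2>; };
			core1 { cpu = <&cpu3>; };
			core2 { cpu = <&cpu4>; };
		};
	};
};
```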