Re: [PATCH] arch_topology: support parsing cache topology from DT
From: Dietmar Eggemann
Date: Thu Apr 07 2022 - 16:33:23 EST
On 06/04/2022 11:18, Qing Wang wrote:
> From: wangqing <11112896@xxxxxxxxxx>
[...]
> +void init_cpu_cache_topology(void)
> +{
> +	struct device_node *node_cpu, *node_cache;
> +	int cpu;
> +	int level = 0;
> +
> +	for_each_possible_cpu(cpu) {
> +		node_cpu = of_get_cpu_node(cpu, NULL);
> +		if (!node_cpu)
> +			continue;
> +
> +		level = 0;
> +		node_cache = node_cpu;
> +		while (level < MAX_CACHE_LEVEL) {
> +			node_cache = of_parse_phandle(node_cache, "next-level-cache", 0);
> +			if (!node_cache)
> +				break;
> +
> +			cache_topology[cpu][level++] = node_cache;
> +		}
> +		of_node_put(node_cpu);
> +	}
> +}
From where is init_cpu_cache_topology() called?
> +bool cpu_share_llc(int cpu1, int cpu2)
> +{
> +	int cache_level;
> +
> +	for (cache_level = MAX_CACHE_LEVEL - 1; cache_level > 0; cache_level--) {
> +		if (!cache_topology[cpu1][cache_level])
> +			continue;
> +
> +		if (cache_topology[cpu1][cache_level] == cache_topology[cpu2][cache_level])
> +			return true;
> +
> +		return false;
> +	}
> +
> +	return false;
> +}
As I mentioned in:
https://lkml.kernel.org/r/73b491fe-b5e8-ebca-081e-fa339cc903e1@xxxxxxx
the correct setting in DT's cpu-map node (only core nodes in your case,
i.e. one DynamIQ cluster) will give you the correct LLC (highest
SD_SHARE_PKG_RESOURCES) setting. See:
https://www.kernel.org/doc/Documentation/devicetree/bindings/arm/topology.txt
> +
> +bool cpu_share_l2c(int cpu1, int cpu2)
> +{
> +	if (!cache_topology[cpu1][0])
> +		return false;
> +
> +	if (cache_topology[cpu1][0] == cache_topology[cpu2][0])
> +		return true;
> +
> +	return false;
> +}
> +
> /*
> * cpu topology table
> */
> @@ -662,7 +720,7 @@ const struct cpumask *cpu_coregroup_mask(int cpu)
>  		/* not numa in package, lets use the package siblings */
>  		core_mask = &cpu_topology[cpu].core_sibling;
>  	}
> -	if (cpu_topology[cpu].llc_id != -1) {
> +	if (cpu_topology[cpu].llc_id != -1 || cache_topology[cpu][0]) {
>  		if (cpumask_subset(&cpu_topology[cpu].llc_sibling, core_mask))
>  			core_mask = &cpu_topology[cpu].llc_sibling;
>  	}
> @@ -684,7 +742,8 @@ void update_siblings_masks(unsigned int cpuid)
>  	for_each_online_cpu(cpu) {
>  		cpu_topo = &cpu_topology[cpu];
>
> -		if (cpuid_topo->llc_id == cpu_topo->llc_id) {
> +		if ((cpuid_topo->llc_id != -1 && cpuid_topo->llc_id == cpu_topo->llc_id)
> +			|| (cpuid_topo->llc_id == -1 && cpu_share_llc(cpu, cpuid))) {
Assuming a:
      .---------------.
CPU   |0 1 2 3 4 5 6 7|
      +---------------+
uarch |l l l l m m m b| (so called tri-gear: little, medium, big)
      +---------------+
  L2  |   |   | | | | |
      +---------------+
  L3  |<--         -->|
      +---------------+
      |<-- cluster -->|
      +---------------+
      |<--   DSU   -->|
      '---------------'
system, I guess you would get (w/ Phantom SD and L2/L3 cache info in DT):
CPU0..3:
  MC   SD_SHARE_PKG_RESOURCES
  DIE  no SD_SHARE_PKG_RESOURCES
CPU4..7:
  DIE  no SD_SHARE_PKG_RESOURCES
I can't see how this would make any sense ...
The reason is cpu_share_llc(). Its loop stops at cache_level > 0, so
cache_level=0 is never checked, and w/
CPU0..3:
  cache_topology[CPUX][0] == L2
  cache_topology[CPUX][1] == L3
CPU4..7:
  cache_topology[CPUX][0] == L3
there is, except for CPU0-1 and CPU2-3, no LLC match.
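To make this easier to poke at, here is a rough userspace sketch that replays
the control flow of the quoted cpu_share_llc() on the layout above. It is only
an illustration, not kernel code: the enum values stand in for the
struct device_node pointers, MAX_CACHE_LEVEL is set to 7 as an assumption (its
definition is not in the quoted hunk), and llc_match_mask() and
dump_llc_matches() are hypothetical helpers that are not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define NR_CPUS		8
#define MAX_CACHE_LEVEL	7	/* assumption: value not shown in the quoted hunk */

/* Stand-ins for the struct device_node pointers stored by the patch. */
enum { NONE = 0, L2_A, L2_B, L3 };

/*
 * Index 0 is the first "next-level-cache" hop from the CPU node, as in
 * init_cpu_cache_topology(). CPU0-1 and CPU2-3 each share an L2; every
 * CPU ends up at the same L3.
 */
static const int cache_topology[NR_CPUS][MAX_CACHE_LEVEL] = {
	[0] = { L2_A, L3 },
	[1] = { L2_A, L3 },
	[2] = { L2_B, L3 },
	[3] = { L2_B, L3 },
	[4] = { L3 },
	[5] = { L3 },
	[6] = { L3 },
	[7] = { L3 },
};

/* Same control flow as the quoted cpu_share_llc(): level 0 is never read. */
bool cpu_share_llc(int cpu1, int cpu2)
{
	int cache_level;

	for (cache_level = MAX_CACHE_LEVEL - 1; cache_level > 0; cache_level--) {
		if (!cache_topology[cpu1][cache_level])
			continue;

		if (cache_topology[cpu1][cache_level] == cache_topology[cpu2][cache_level])
			return true;

		return false;
	}

	return false;
}

/* Hypothetical helper: bit n is set if cpu_share_llc(cpu, n) reports a match. */
unsigned int llc_match_mask(int cpu)
{
	unsigned int mask = 0;
	int other;

	assert(cpu >= 0 && cpu < NR_CPUS);

	for (other = 0; other < NR_CPUS; other++)
		if (cpu_share_llc(cpu, other))
			mask |= 1u << other;

	return mask;
}

/* Hypothetical helper: print one match mask per CPU. */
void dump_llc_matches(void)
{
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		printf("CPU%d: 0x%02x\n", cpu, llc_match_mask(cpu));
}
```

Calling dump_llc_matches() from a small main() shows the asymmetry in this
model: CPU4..7 get an empty mask because their only entry sits at level 0,
which the loop never reaches, while CPU0..3 only ever match each other at
level 1.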
[...]