Re: [PATCH v3 07/16] arch_topology: Use the last level cache information from the cacheinfo
From: Dietmar Eggemann
Date: Thu Jun 02 2022 - 10:26:19 EST
On 25/05/2022 10:14, Sudeep Holla wrote:
> The cacheinfo is now initialised early along with the CPU topology
> initialisation. Instead of relying on the LLC ID information parsed
> separately only with ACPI PPTT elsewhere, migrate to use the similar
> information from the cacheinfo.
>
> This is generic for both DT and ACPI systems. The ACPI LLC ID information
> parsed separately can now be removed from arch specific code.
>
> Signed-off-by: Sudeep Holla <sudeep.holla@xxxxxxx>
> ---
> drivers/base/arch_topology.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> index 765723448b10..4c486e4e6f2f 100644
> --- a/drivers/base/arch_topology.c
> +++ b/drivers/base/arch_topology.c
> @@ -663,7 +663,8 @@ const struct cpumask *cpu_coregroup_mask(int cpu)
> /* not numa in package, lets use the package siblings */
> core_mask = &cpu_topology[cpu].core_sibling;
> }
> - if (cpu_topology[cpu].llc_id != -1) {
> +
> + if (last_level_cache_is_valid(cpu)) {
> if (cpumask_subset(&cpu_topology[cpu].llc_sibling, core_mask))
> core_mask = &cpu_topology[cpu].llc_sibling;
> }
> @@ -694,7 +695,7 @@ void update_siblings_masks(unsigned int cpuid)
> for_each_online_cpu(cpu) {
> cpu_topo = &cpu_topology[cpu];
>
> - if (cpu_topo->llc_id != -1 && cpuid_topo->llc_id == cpu_topo->llc_id) {
> + if (last_level_cache_is_shared(cpu, cpuid)) {
> cpumask_set_cpu(cpu, &cpuid_topo->llc_sibling);
> cpumask_set_cpu(cpuid, &cpu_topo->llc_sibling);
> }
I tested v3 on a Kunpeng920 (w/o CONFIG_NUMA) and it looks
like that last_level_cache_is_shared() isn't working as
expected.
I instrumented cpu_coregroup_mask() like:
const struct cpumask *cpu_coregroup_mask(int cpu)
{
const cpumask_t *core_mask = cpumask_of_node(cpu_to_node(cpu));
if (cpumask_subset(&cpu_topology[cpu].core_sibling, core_mask)) {
core_mask = &cpu_topology[cpu].core_sibling;
(1)
}
(2)
if (last_level_cache_is_valid(cpu)) {
if (cpumask_subset(&cpu_topology[cpu].llc_sibling, core_mask))
core_mask = &cpu_topology[cpu].llc_sibling;
(3)
}
if (IS_ENABLED(CONFIG_SCHED_CLUSTER) &&
cpumask_subset(core_mask, &cpu_topology[cpu].cluster_sibling))
core_mask = &cpu_topology[cpu].cluster_sibling;
(4)
(5)
return core_mask;
}
and got:
(A) v3 patch-set:
[ 11.561133] (1) cpu_coregroup_mask[0]=0-47
[ 11.565670] (2) last_level_cache_is_valid(0)=1
[ 11.570587] (3) cpu_coregroup_mask[0]=0 <-- llc_sibling=0 (should be 0-23)
[ 11.574833] (4) cpu_coregroup_mask[0]=0-3 <-- Altra hack kicks in!
[ 11.579275] (5) cpu_coregroup_mask[0]=0-3
# cat /sys/kernel/debug/sched/domains/cpu0/domain*/name
CLS
DIE
# cat /proc/schedstat | awk '{print $1 " " $2 }' | grep ^[cd] | head -3
cpu0 0
domain0 00000000,00000000,0000000f
domain1 ffffffff,ffffffff,ffffffff
So the MC domain is missing.
(B) mainline as reference (cpu_coregroup_mask() slightly different):
[ 11.585008] (1) cpu_coregroup_mask[0]=0-47
[ 11.589544] (3) cpu_coregroup_mask[0]=0-23 <-- !!!
[ 11.594079] (5) cpu_coregroup_mask[0]=0-23
# cat /sys/kernel/debug/sched/domains/cpu0/domain*/name
CLS
MC <-- !!!
DIE
# cat /proc/schedstat | awk '{print $1 " " $2 }' | grep ^[cd] | head -4
cpu0 0
domain0 00000000,00000000,0000000f
domain1 00000000,00000000,00ffffff <-- !!!
domain2 ffffffff,ffffffff,ffffffff