Re: [PATCH] arm64: cpufeature: Expose the real mpidr value to EL0

From: Robin Murphy
Date: Wed Sep 13 2023 - 07:23:51 EST


On 2023-09-13 10:44, guojinhui wrote:
From EL0, userspace can read the MIDR register's value to distinguish
the vendor. But an mrs of the MPIDR register from EL0 won't return the
register's real value. The MPIDR value is useful for obtaining CPU
topology information.

...except there's no guarantee that the MPIDR value is anything other
than a unique identifier. Proper topology information is already exposed
to userspace[1], as described by ACPI PPTT or Devicetree[2]. Userspace
should be using that.

Not to mention that userspace fundamentally can't guarantee it won't be
migrated at just the wrong point and read the MPIDR of a different CPU
anyway. (This is why the MIDRs and REVIDRs are also reported via sysfs,
such that userspace has a stable and reliable source of information in
case it needs to consider potential errata.)
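
As a rough illustration of the sysfs route, here is a minimal userspace sketch that reads an ID register from `/sys/devices/system/cpu/cpuN/regs/identification/` (the `midr_el1` and `revidr_el1` files). The helper names `parse_id_reg()` and `read_id_reg()` are made up for this example, and it assumes the file holds a single hex value.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: parse the hex text of an ID register, e.g. "0x...\n". */
static int parse_id_reg(const char *text, uint64_t *value)
{
	char *end;
	unsigned long long v;

	errno = 0;
	v = strtoull(text, &end, 16);	/* base 16 accepts a leading "0x" */
	if (errno != 0 || end == text)
		return -1;
	*value = (uint64_t)v;
	return 0;
}

/* Hypothetical helper: read midr_el1 or revidr_el1 for one logical CPU. */
static int read_id_reg(int cpu, const char *name, uint64_t *value)
{
	char path[128], buf[32];
	FILE *f;
	int ret = -1;

	snprintf(path, sizeof(path),
		 "/sys/devices/system/cpu/cpu%d/regs/identification/%s",
		 cpu, name);
	f = fopen(path, "r");
	if (!f)
		return -1;	/* e.g. not arm64, or the file is not present */
	if (fgets(buf, sizeof(buf), f))
		ret = parse_id_reg(buf, value);
	fclose(f);
	return ret;
}
```

Because the value is read per logical CPU number, it does not matter which CPU the reading task happens to be running on.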

Thanks,
Robin.

[1] https://www.kernel.org/doc/html/latest/admin-guide/cputopology.html
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/devicetree/bindings/cpu/cpu-topology.txt

1. If we can get the vendor information (from MIDR), I think it is possible to obtain
the die information from the MPIDR value. Take the kunpeng-920, for example, with
4 cores per cluster and 8 clusters per die, whose MPIDR values are as follows:

```
<DIE>.<CLUSTER>.<CORE>.<HT>

cpu = 0, 81080000
cpu = 1, 81080100
...
cpu = 3, 81080300
cpu = 4, 81090000
...
cpu = 7, 81090300
cpu = 8, 810a0000
...
cpu = 11, 810a0300
cpu = 12, 810b0000
...
cpu = 15, 810b0300
cpu = 16, 810c0000
...
cpu = 19, 810c0300
cpu = 20, 810d0000
...
cpu = 31, 810f0300
cpu = 32, 81180000
...
cpu = 63, 811f0300
```

We can tell the dies apart by the leading digits, 0x810 and 0x811.
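
To make the decoding concrete, here is a small sketch. The affinity field positions (Aff0 to Aff2 in bits [23:0], MT in bit 24, Aff3 in bits [39:32]) are architectural. The split of Aff2 into a die number and a cluster number is only a guess that fits the values listed above; `guess_die()` and `guess_cluster()` are hypothetical names and say nothing about other SoCs.

```c
#include <stdint.h>

/* Architectural MPIDR_EL1 affinity fields: level 0..3 selects Aff0..Aff3. */
static unsigned int mpidr_aff(uint64_t mpidr, int level)
{
	static const int shift[] = { 0, 8, 16, 32 };

	return (unsigned int)((mpidr >> shift[level]) & 0xff);
}

/* MT bit (bit 24): set when the lowest affinity level denotes threads. */
static int mpidr_mt(uint64_t mpidr)
{
	return (int)((mpidr >> 24) & 1);
}

/*
 * ASSUMPTION, inferred only from the listed values: Aff2 runs 0x08..0x0f
 * on one die and 0x18..0x1f on the other, so the upper nibble looks like
 * a die number and the low three bits like a cluster number.
 */
static unsigned int guess_die(uint64_t mpidr)
{
	return mpidr_aff(mpidr, 2) >> 4;
}

static unsigned int guess_cluster(uint64_t mpidr)
{
	return mpidr_aff(mpidr, 2) & 0x7;
}
```

With these helpers, 0x81080100 decodes to thread 0 (Aff0), core 1 (Aff1) and Aff2 = 0x08, which matches the `<DIE>.<CLUSTER>.<CORE>.<HT>` reading above.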

This is very much a platform-specific assumption, though, and once you're assuming enough to be able to derive anything meaningful from a raw MPIDR, you could equally derive the same thing from existing sources like NUMA topology (if you know the SoC, then for sure you can know how nodes relate to dies).

2. We can bind the task to a specific CPU to obtain its MPIDR value.

...unless that CPU then gets offlined, the task is forcibly migrated elsewhere, and ends up obtaining the *wrong* MPIDR value :(
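
For illustration, a minimal sketch of the pin-then-read pattern and why it stays racy. `pin_to_cpu()`, `read_on_cpu()` and `sample_reader()` are hypothetical names, and `sample_reader()` is only a stand-in for whatever per-CPU value is being read. Checking the CPU again after the read narrows the window but cannot close it.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <sched.h>

typedef unsigned long (*cpu_reader_t)(void);

/* Restrict the calling thread to one logical CPU; returns 0 on success. */
static int pin_to_cpu(int cpu)
{
	cpu_set_t set;

	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	return sched_setaffinity(0, sizeof(set), &set);
}

/* Placeholder reader for this sketch: reports the CPU it ran on. */
static unsigned long sample_reader(void)
{
	return (unsigned long)sched_getcpu();
}

/*
 * Pin, read, then re-check. If the CPU was offlined in between, the
 * kernel moves the task elsewhere and the final check fails, so the
 * caller can discard the value. This is best effort only.
 */
static int read_on_cpu(int cpu, cpu_reader_t reader, unsigned long *out)
{
	if (pin_to_cpu(cpu) != 0 || sched_getcpu() != cpu)
		return -1;
	*out = reader();
	return sched_getcpu() == cpu ? 0 : -1;
}
```

A caller would have to treat a failure as "value unknown" and retry or give up, which is more fragile than reading a per-CPU file from sysfs.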

3. Before submitting the patch, I checked the sysfs interface
`/sys/devices/system/cpu/cpuN/topology/*` on Ampere and kunpeng-920 with the
latest Linux kernel, but it doesn't provide die information.

```
# ls /sys/devices/system/cpu/cpu0/topology/
cluster_cpus cluster_cpus_list cluster_id core_cpus core_cpus_list core_id core_siblings core_siblings_list package_cpus package_cpus_list physical_package_id thread_siblings thread_siblings_list
# cat /sys/devices/system/cpu/cpu0/topology/*
00000000,00000000,00000000,00000003
0-1
616
00000000,00000000,00000000,00000001
0
6656
00000000,00000000,ffffffff,ffffffff
0-63
00000000,00000000,ffffffff,ffffffff
0-63
0
00000000,00000000,00000000,00000001
0

# uname -r
6.6.0-rc1
```
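
Assuming `cat` printed the files in the same order as `ls`, the 13 output lines map one to one onto the 13 files, so `core_siblings` and `package_cpus` both cover CPUs 0-63 and no file carries a die ID. A small sketch for turning such a mask string into a CPU count follows; `cpumask_weight()` is a name made up for this example.

```c
#include <ctype.h>

/*
 * Count the CPUs set in a sysfs cpumask string such as
 * "00000000,ffffffff". Commas only separate 32-bit words.
 * Returns -1 if an unexpected character is found.
 */
static int cpumask_weight(const char *mask)
{
	int count = 0;

	for (; *mask && *mask != '\n'; mask++) {
		int digit;

		if (*mask == ',')
			continue;
		if (!isxdigit((unsigned char)*mask))
			return -1;
		digit = isdigit((unsigned char)*mask)
			? *mask - '0'
			: tolower((unsigned char)*mask) - 'a' + 10;
		/* Add up the set bits of this hex digit. */
		for (; digit; digit >>= 1)
			count += digit & 1;
	}
	return count;
}
```

Applied to the values above, the first mask gives 2 CPUs in the cluster and the `package_cpus` mask gives 64 CPUs in the package.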

Then I checked the code that parses the CPU topology information from the PPTT:

```
int __init parse_acpi_topology(void)
{
	int cpu, topology_id;

	if (acpi_disabled)
		return 0;

	for_each_possible_cpu(cpu) {
		topology_id = find_acpi_cpu_topology(cpu, 0);
		if (topology_id < 0)
			return topology_id;

		if (acpi_cpu_is_threaded(cpu)) {
			cpu_topology[cpu].thread_id = topology_id;
			topology_id = find_acpi_cpu_topology(cpu, 1);
			cpu_topology[cpu].core_id = topology_id;
		} else {
			cpu_topology[cpu].thread_id = -1;
			cpu_topology[cpu].core_id = topology_id;
		}
		topology_id = find_acpi_cpu_topology_cluster(cpu);
		cpu_topology[cpu].cluster_id = topology_id;
		topology_id = find_acpi_cpu_topology_package(cpu);
		cpu_topology[cpu].package_id = topology_id;
	}

	return 0;
}
```

Actually, it only fills in the thread, core, cluster and package IDs,
even though the PPTT provides die information.

Maybe we can implement some code to obtain die information from the PPTT?

I guess if any additional levels of hierarchy exist between the root "package" level and what we infer to be the "cluster" level, then it seems reasonable to me to infer the next level above "package" to be "die". Then it looks like pretty much just a case of wiring up topology_die_id() through the generic topology code.

Thanks,
Robin.