Re: [PATCH v1] arch_topology: Adjust initial CPU capacities with current freq

From: Robin Murphy
Date: Sat Jan 11 2020 - 10:19:49 EST


On 2020-01-11 2:51 am, JeffyChen wrote:
Hi Robin,

Thanks for the clarification :)

On 01/10/2020 08:28 PM, Robin Murphy wrote:
On 2020-01-10 12:01 pm, Dietmar Eggemann wrote:
On 10/01/2020 12:37, Sudeep Holla wrote:
On Thu, Jan 09, 2020 at 03:52:14PM +0800, Jeffy Chen wrote:
The CPU freqs are not supposed to change before the cpufreq policies are
properly registered, meaning that they should be used to calculate the
initial CPU capacities.

Doing this helps choose the best CPU during early boot, especially
for initramfs decompression.

Signed-off-by: Jeffy Chen <jeffy.chen@xxxxxxxxxxxxxx>

[...]

@@ -146,10 +153,15 @@ bool __init topology_parse_cpu_capacity(struct device_node *cpu_node, int cpu)
                 return false;
             }
         }
-        capacity_scale = max(cpu_capacity, capacity_scale);
         raw_capacity[cpu] = cpu_capacity;
         pr_debug("cpu_capacity: %pOF cpu_capacity=%u (raw)\n",
             cpu_node, raw_capacity[cpu]);
+
+        cpu_clk = of_clk_get(cpu_node, 0);
+        if (!PTR_ERR_OR_ZERO(cpu_clk))
+            per_cpu(max_freq, cpu) = clk_get_rate(cpu_clk) / 1000;
+
+        clk_put(cpu_clk);

I don't like to assume DVFS to be supplied only using 'clk'. So NACK!
We have other non-clk mechanism for CPU DVFS and this needs to simply
use cpufreq APIs to get frequency value if required.

To support this, it's failing on my Arm64 Juno board.

...
[    0.084858] CPU1 cpu_clk=-517
[    0.087961] CPU2 cpu_clk=-517
[    0.091005] CPU0 cpu_clk=-517
[    0.094121] CPU3 cpu_clk=-517
[    0.097248] CPU4 cpu_clk=-517
[    0.100415] CPU5 cpu_clk=-517

Is there any other way to get the initial CPU capacity for this case?

Or can we just assume all the cores are running at the same freq here?

...

Since you're on a big.LITTLE platform, did you specify
'capacity-dmips-mhz' for CPUs to be able to distinguish big and little
CPUs before CPUfreq kicks in?

Indeed, and that's the "problem" - the capacities are there, but with
the broken firmware the kernel starts with the little (boot) cluster
clocked at either 400 or 200MHz, but the big cluster at just 12MHz. At
that speed, a full distro config can take about 3 minutes to get to the
point of loading cpufreq as a module, and I've seen at least one distro
reverting 97df3aa76b4a to 'fix' the symptom :(
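
To put rough numbers on the mismatch described above, here is a minimal userspace sketch (not kernel code) that weights each CPU's capacity-dmips-mhz by its current clock and normalises the largest result to 1024. The helper name scale_capacities and the capacity-dmips-mhz values of 485 and 1024 used in the note below are illustrative assumptions, not taken from this thread; the 400MHz and 12MHz clocks are the ones mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same normalisation target the scheduler uses for CPU capacity. */
#define SCHED_CAPACITY_SCALE 1024u

/*
 * Hypothetical helper: weight each CPU's DT capacity (DMIPS/MHz) by its
 * current clock (kHz), then scale so the fastest CPU ends up at 1024.
 */
static void scale_capacities(const uint32_t *dmips_mhz,
                             const uint32_t *freq_khz,
                             uint32_t *capacity, size_t n)
{
    uint64_t max_raw = 0;
    size_t i;

    assert(dmips_mhz && freq_khz && capacity);

    /* First pass: find the largest frequency-weighted capacity. */
    for (i = 0; i < n; i++) {
        uint64_t raw = (uint64_t)dmips_mhz[i] * freq_khz[i];

        if (raw > max_raw)
            max_raw = raw;
    }

    /* Second pass: normalise every CPU against that maximum. */
    for (i = 0; i < n; i++) {
        uint64_t raw = (uint64_t)dmips_mhz[i] * freq_khz[i];

        capacity[i] = max_raw ?
            (uint32_t)(raw * SCHED_CAPACITY_SCALE / max_raw) : 0;
    }
}
```

With the assumed inputs {485, 1024} DMIPS/MHz and {400000, 12000} kHz, this gives 1024 for the little CPU and 64 for the big one, whereas ignoring the clocks (equal frequencies) gives 485 and 1024. That gap is roughly why early-boot work placed on the "big" cluster crawls until cpufreq takes over.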

Right, for the big cluster, the bootrom (maskrom) will init the clock to 24MHz, and if the bootloader (u-boot, for example) doesn't bump it, it becomes 12MHz after the kernel has initialized the whole clk tree.

And in rockchip's BSP 4.4 kernel, there are hacks to bump it to 800MHz (a higher freq might require changing the regulator) during clk tree initialization; the BSP u-boot also added that recently.

The chromeos coreboot looks fine, but upstream u-boot seems to be missing that part too. I'll try to send a patch for that :)

Actually, last time I looked both the BSP U-Boot and mainline do contain equivalent code to initialise both PLLs to (IIRC) 600MHz and apparently adjust a couple of other things set by the maskrom. The trap is that mainline does it in the SPL - thus the unfortunately common combination of using the upstream main stage with the miniloader ends up missing out that step entirely. In comparison, I'm now using the full upstream TPL/SPL flow on my RK3399 board (NanoPC-T4) and even a full generic distro kernel is acceptably quick:

[    2.315378] Trying to unpack rootfs image as initramfs...
[    2.781747] Freeing initrd memory: 7316K
...
[    4.239990] Freeing unused kernel memory: 1984K
[    4.247829] Run /init as init process

Robin.