Re: [RFC PATCH] arm64: dts: allwinner: a64/h5: Add CPU idle states

From: Samuel Holland
Date: Wed Mar 24 2021 - 00:45:38 EST


On 3/22/21 8:56 PM, Andre Przywara wrote:
>> I'm sending this patch as an RFC because it raises questions about how
>> we handle firmware versioning. How far back does (or should) our support
>> for old TF-A and Crust versions go?
>>
>> cpuidle has a problem that without working firmware support, CPUs will
>> enter idle states and be unable to wake up. As a result, the system will
>> hang at some point during boot, usually before getting to userspace.
>>
>> For over a year[0], TF-A has exposed the PSCI CPU_SUSPEND function when
>> a SCPI implementation is present[1]. Implementing CPU_SUSPEND is
>> required for implementing SYSTEM_SUSPEND[2], even if CPU_SUSPEND is not
>> itself used for anything.
>>
>> However, there was no code to actually wake up a CPU once it called the
>> CPU_SUSPEND function, because I could not find the register providing
>> the necessary information. The fact that CPU_SUSPEND was broken affected
>> nobody, because nothing ever called it -- there were no idle states in
>> the DTS. In hindsight, what I should have done was always return failure
>> from sunxi_validate_power_state(), but that ship has long sailed.
>>
>> I finally found the elusive register and implemented the wakeup code
>> earlier this month[3]. So now, CPU_SUSPEND actually works, if all of
>> your firmware is up to date, and cpuidle works if you add the states in
>> your device tree.
>>
>> Unfortunately, there is currently nothing verifying that compatibility.
>> So you can get into four possible scenarios:
>> 1) No idle states in DTS, any firmware => Linux works, with baseline
>> power consumption.
>> 2) Idle states added to DTS, no Crust/SCPI => Linux works, but every
>> attempt to enter an idle state is rejected because CPU_SUSPEND is
>> not hooked up. So power consumption increases by a sizable amount.
>> 3) Idle states added to DTS, "old" Crust/SCPI (before [3]) => Linux
>> fails to boot, because CPUs never return from idle states.
>> 4) Idle states added to DTS, "new" Crust/SCPI (after [3]) => Linux
>> works, with improved power consumption compared to the baseline.
>>
>> Obviously, we want to prevent scenario 3 if possible.
>
> So I think the core of the problem is that the DT describes some
> firmware feature, but we have the DT bundled with the kernel, not the
> firmware.

I would say the core problem is that the firmware lies about supporting
PSCI CPU_SUSPEND. Linux shouldn't be calling CPU_SUSPEND if the firmware
declares it as unavailable, regardless of what is in the DTS.
(Technically, per the PSCI standard, CPU_SUSPEND is a mandatory
function, but a quick survey of the TF-A platforms shows it is far from
universally implemented.)

> So is there any way we can detect an older crust version in U-Boot,
> then remove any potential idle states from the DT?

Let's assume that we are using a functioning SoC (H3) or the secure fuse
is blown (A64) and therefore U-Boot cannot access SRAM A2. I can think
of three ways it can learn about crust:

a) PSCI_FEATURES (e.g. is CPU_SUSPEND supported)
b) Metadata in the FIT image
c) Custom SMCs

TF-A has some additional methods available:

d) The SCPI-reported firmware version
e) The magic number at the beginning of the firmware binary

> Granted, this requires recent U-Boot as well, but at least we could try
> to mitigate the worst case a bit?

If we're okay with modifying firmware to solve this problem, then I
propose the following solution:

1) Version bump crust or change its magic number.
2) Modify TF-A to only report CPU_SUSPEND as available if it detects the
new crust version. This would involve conditionally setting
sunxi_scpi_psci_ops.validate_power_state, and updating psci_setup.c
to also check for .validate_power_state when setting psci_caps.
3) Modify the Linux PSCI client to respect PSCI_FEATURES when setting
psci_ops.cpu_suspend. cpuidle-psci checks for this function before
setting up idle states.
4) Finally, after some time, add the idle states to the DTS.

In fact, this solution solves both scenarios 2 and 3, because it also
takes care of the native PM implementation, which doesn't implement
CPU_SUSPEND at all.

Does that sound workable?

Regards,
Samuel

> A better solution could be to only *add* the idle states if the rest of
> the firmware is deemed worthy. So the mainline DTs would not carry the
> properties in the first place, and only U-Boot adds them, on detecting
> a capable firmware?
> Admittedly this changes the "flow" of the DT, where the kernel is the
> authority, but it might help to solve this problem?
>
> Or any other way, which involves U-Boot patching the DTB? (This would
> apply to the DTB passed to the kernel, regardless of where and when
> it's loaded from)
>
> Any opinions?
>
> Cheers,
> Andre
>
>> Enter the current patch: I chose the arm,psci-suspend-param values
>> specifically so they would be _rejected_ by the current TF-A code. This
>> makes scenario 3 behave like scenario 2. I then have some follow-up TF-A
>> patches (not yet submitted) to switch to the new parameter encoding[4].
>>
>> This brings me back to my original question. Once the TF-A patches in
>> [4] are merged, scenario 3 (with an updated TF-A but an old Crust) would
>> fail to boot again. Do we care?
>>
>> Should I implement some kind of runtime version checking, so TF-A can
>> disable CPU_SUSPEND if it would be broken? Or instead, should we wait
>> some amount of time to merge this patch (or the patches at [4]) and
>> assume people have upgraded?
>>
>> Where would people expect this sort of possibly-breaking change to be
>> documented?
>>
>> Separately, since I assume most A64/H5 users (outside of LibreELEC and
>> the PinePhone) are not using Crust, scenario 2 would be very common. If
>> merging this patch increases their idle power draw by 500 mW, is that an
>> acceptable cost for decreasing other users' idle power draw by 50 mW?
>>
>> Sorry for the wall of text,
>> Samuel
>>
>> [0]: https://git.trustedfirmware.org/TF-A/trusted-firmware-a.git/commit/plat/allwinner/common/sunxi_pm.c?id=e382c88e2a26995099bb931d49e754dcaebc5593
>> [1]: https://git.trustedfirmware.org/TF-A/trusted-firmware-a.git/tree/plat/allwinner/common/sunxi_scpi_pm.c?id=2e0e51f42586826a1f6f6c1e532f90e6df642cf5#n190
>> [2]: https://git.trustedfirmware.org/TF-A/trusted-firmware-a.git/tree/lib/psci/psci_setup.c?id=2e0e51f42586826a1f6f6c1e532f90e6df642cf5#n251
>> [3]: https://github.com/crust-firmware/crust/commits/85944467c804
>> [4]: https://github.com/crust-firmware/arm-trusted-firmware/commits/d6ebf5dab2da
>>
>> ---
>>
>> arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi | 26 +++++++++++++++++++
>> arch/arm64/boot/dts/allwinner/sun50i-h5.dtsi | 26 +++++++++++++++++++
>> 2 files changed, 52 insertions(+)
>>
>> diff --git a/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi b/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi
>> index 57786fc120c3..2b1b5b36098c 100644
>> --- a/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi
>> +++ b/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi
>> @@ -54,6 +54,7 @@ cpu0: cpu@0 {
>> clocks = <&ccu CLK_CPUX>;
>> clock-names = "cpu";
>> #cooling-cells = <2>;
>> + cpu-idle-states = <&cpu_sleep>, <&cluster_sleep>;
>> };
>>
>> cpu1: cpu@1 {
>> @@ -65,6 +66,7 @@ cpu1: cpu@1 {
>> clocks = <&ccu CLK_CPUX>;
>> clock-names = "cpu";
>> #cooling-cells = <2>;
>> + cpu-idle-states = <&cpu_sleep>, <&cluster_sleep>;
>> };
>>
>> cpu2: cpu@2 {
>> @@ -76,6 +78,7 @@ cpu2: cpu@2 {
>> clocks = <&ccu CLK_CPUX>;
>> clock-names = "cpu";
>> #cooling-cells = <2>;
>> + cpu-idle-states = <&cpu_sleep>, <&cluster_sleep>;
>> };
>>
>> cpu3: cpu@3 {
>> @@ -87,6 +90,29 @@ cpu3: cpu@3 {
>> clocks = <&ccu CLK_CPUX>;
>> clock-names = "cpu";
>> #cooling-cells = <2>;
>> + cpu-idle-states = <&cpu_sleep>, <&cluster_sleep>;
>> + };
>> +
>> + idle-states {
>> + entry-method = "psci";
>> +
>> + cpu_sleep: cpu-sleep {
>> + compatible = "arm,idle-state";
>> + local-timer-stop;
>> + entry-latency-us = <800>;
>> + exit-latency-us = <1500>;
>> + min-residency-us = <25000>;
>> + arm,psci-suspend-param = <0x00010003>;
>> + };
>> +
>> + cluster_sleep: cluster-sleep {
>> + compatible = "arm,idle-state";
>> + local-timer-stop;
>> + entry-latency-us = <850>;
>> + exit-latency-us = <1500>;
>> + min-residency-us = <50000>;
>> + arm,psci-suspend-param = <0x01010013>;
>> + };
>> };
>>
>> L2: l2-cache {
>> diff --git a/arch/arm64/boot/dts/allwinner/sun50i-h5.dtsi b/arch/arm64/boot/dts/allwinner/sun50i-h5.dtsi
>> index 578a63dedf46..1c416f648c58 100644
>> --- a/arch/arm64/boot/dts/allwinner/sun50i-h5.dtsi
>> +++ b/arch/arm64/boot/dts/allwinner/sun50i-h5.dtsi
>> @@ -18,6 +18,7 @@ cpu0: cpu@0 {
>> clocks = <&ccu CLK_CPUX>;
>> clock-latency-ns = <244144>; /* 8 32k periods */
>> #cooling-cells = <2>;
>> + cpu-idle-states = <&cpu_sleep>, <&cluster_sleep>;
>> };
>>
>> cpu1: cpu@1 {
>> @@ -28,6 +29,7 @@ cpu1: cpu@1 {
>> clocks = <&ccu CLK_CPUX>;
>> clock-latency-ns = <244144>; /* 8 32k periods */
>> #cooling-cells = <2>;
>> + cpu-idle-states = <&cpu_sleep>, <&cluster_sleep>;
>> };
>>
>> cpu2: cpu@2 {
>> @@ -38,6 +40,7 @@ cpu2: cpu@2 {
>> clocks = <&ccu CLK_CPUX>;
>> clock-latency-ns = <244144>; /* 8 32k periods */
>> #cooling-cells = <2>;
>> + cpu-idle-states = <&cpu_sleep>, <&cluster_sleep>;
>> };
>>
>> cpu3: cpu@3 {
>> @@ -48,6 +51,29 @@ cpu3: cpu@3 {
>> clocks = <&ccu CLK_CPUX>;
>> clock-latency-ns = <244144>; /* 8 32k periods */
>> #cooling-cells = <2>;
>> + cpu-idle-states = <&cpu_sleep>, <&cluster_sleep>;
>> + };
>> +
>> + idle-states {
>> + entry-method = "psci";
>> +
>> + cpu_sleep: cpu-sleep {
>> + compatible = "arm,idle-state";
>> + local-timer-stop;
>> + entry-latency-us = <800>;
>> + exit-latency-us = <1500>;
>> + min-residency-us = <25000>;
>> + arm,psci-suspend-param = <0x00010003>;
>> + };
>> +
>> + cluster_sleep: cluster-sleep {
>> + compatible = "arm,idle-state";
>> + local-timer-stop;
>> + entry-latency-us = <850>;
>> + exit-latency-us = <1500>;
>> + min-residency-us = <50000>;
>> + arm,psci-suspend-param = <0x01010013>;
>> + };
>> };
>> };
>>
>