Re: [PATCH 1/2] arm64: dts: qcom: sm8650: setup cpu thermal with idle on high temperatures

From: Neil Armstrong
Date: Wed Jan 08 2025 - 04:15:47 EST


On 08/01/2025 04:11, Bjorn Andersson wrote:
On Tue, Jan 07, 2025 at 09:13:18AM +0100, Neil Armstrong wrote:
Hi,

On 07/01/2025 00:39, Bjorn Andersson wrote:
On Fri, Jan 03, 2025 at 03:38:26PM +0100, Neil Armstrong wrote:
On the SM8650, the dynamic clock and voltage scaling (DCVS) is done in an
hardware controlled loop using the LMH and EPSS blocks with constraints and
OPPs programmed in the board firmware.

Since the Hardware does a better job at maintaining the CPUs temperature
in an acceptable range by taking in account more parameters like the die
characteristics or other factory fused values, it makes no sense to try
and reproduce a similar set of constraints with the Linux cpufreq thermal
core.

In addition, the tsens IP is responsible for monitoring the temperature
across the SoC and the current settings will heavily trigger the tsens
UP/LOW interrupts if the CPU temperatures reaches the hardware thermal
constraints which are currently defined in the DT. And since the CPUs
are not hooked in the thermal trip points, the potential interrupts and
calculations are a waste of system resources.

Instead, set higher temperatures in the CPU trip points, and hook some CPU
idle injector with a 100% duty cycle at the highest trip point in the case
the hardware DCVS cannot handle the temperature surge, and try our best to
avoid reaching the critical temperature trip point which should trigger an
inevitable thermal shutdown.


Are you able to hit these higher temperatures? Do you have some test
case where the idle-injection shows to be successful in blocking us from
reaching the critical temp?

No, I've been able to test idle-injection and observed a noticeable effect
but I had to set lower trip, do you know how I can easily "block" LMH/EPSS from
scaling down and let the temp go higher ?


I don't know how to override that configuration.


E.g. in X13s (SC8280XP) we opted for relying on LMH/EPSS and define only
the critical trip for when the hardware fails us.

It's the goal here aswell


How about simplifying the patch by removing the idle-injection step and
just rely on LMH/EPSS and the "critical" trip (at least until someone
can prove that there's value in the extra mitigation)?

OK, but I see value in this idle injection mitigation in that case LMH/EPSS
fails, the only factor in control of HLOS is by stopping scheduling tasks
since frequency won't be able to scale anymore.

Anyway, I agree it can be added later on, so should I drop the 2 trip points
and only leave the critical one ?


Regards,
Bjorn



I have no concerns at all about "removing" the 90C trip point, that
makes total sense to me - let the hardware keep the cores as close to
max as possible, and then use some slower sensor for keeping the system
temperature in check (such as the x13s skin sensor).


PS. The described behavior should apply to anything SDM845 and newer, so
I'd like to see this set/document precedence for other platforms.

Regards,
Bjorn

Signed-off-by: Neil Armstrong <neil.armstrong@xxxxxxxxxx>
---
arch/arm64/boot/dts/qcom/sm8650.dtsi | 274 +++++++++++++++++++++++++++--------
1 file changed, 214 insertions(+), 60 deletions(-)

diff --git a/arch/arm64/boot/dts/qcom/sm8650.dtsi b/arch/arm64/boot/dts/qcom/sm8650.dtsi
index 25e47505adcb790d09f1d2726386438487255824..448374a32e07151e35727d92fab77356769aea8a 100644
--- a/arch/arm64/boot/dts/qcom/sm8650.dtsi
+++ b/arch/arm64/boot/dts/qcom/sm8650.dtsi
@@ -99,6 +99,13 @@ l3_0: l3-cache {
cache-unified;
};
};
+
+ cpu0_idle: thermal-idle {
+ #cooling-cells = <2>;
+ duration-us = <800000>;
+ exit-latency-us = <10000>;
+ };
+
};
cpu1: cpu@100 {
@@ -119,6 +126,12 @@ cpu1: cpu@100 {
qcom,freq-domain = <&cpufreq_hw 0>;
#cooling-cells = <2>;
+
+ cpu1_idle: thermal-idle {
+ #cooling-cells = <2>;
+ duration-us = <800000>;
+ exit-latency-us = <10000>;
+ };
};
cpu2: cpu@200 {
@@ -146,6 +159,12 @@ l2_200: l2-cache {
cache-unified;
next-level-cache = <&l3_0>;
};
+
+ cpu2_idle: thermal-idle {
+ #cooling-cells = <2>;
+ duration-us = <800000>;
+ exit-latency-us = <10000>;
+ };
};
cpu3: cpu@300 {
@@ -166,6 +185,12 @@ cpu3: cpu@300 {
qcom,freq-domain = <&cpufreq_hw 3>;
#cooling-cells = <2>;
+
+ cpu3_idle: thermal-idle {
+ #cooling-cells = <2>;
+ duration-us = <800000>;
+ exit-latency-us = <10000>;
+ };
};
cpu4: cpu@400 {
@@ -193,6 +218,12 @@ l2_400: l2-cache {
cache-unified;
next-level-cache = <&l3_0>;
};
+
+ cpu4_idle: thermal-idle {
+ #cooling-cells = <2>;
+ duration-us = <800000>;
+ exit-latency-us = <10000>;
+ };
};
cpu5: cpu@500 {
@@ -220,6 +251,12 @@ l2_500: l2-cache {
cache-unified;
next-level-cache = <&l3_0>;
};
+
+ cpu5_idle: thermal-idle {
+ #cooling-cells = <2>;
+ duration-us = <800000>;
+ exit-latency-us = <10000>;
+ };
};
cpu6: cpu@600 {
@@ -247,6 +284,12 @@ l2_600: l2-cache {
cache-unified;
next-level-cache = <&l3_0>;
};
+
+ cpu6_idle: thermal-idle {
+ #cooling-cells = <2>;
+ duration-us = <800000>;
+ exit-latency-us = <10000>;
+ };
};
cpu7: cpu@700 {
@@ -274,6 +317,12 @@ l2_700: l2-cache {
cache-unified;
next-level-cache = <&l3_0>;
};
+
+ cpu7_idle: thermal-idle {
+ #cooling-cells = <2>;
+ duration-us = <800000>;
+ exit-latency-us = <10000>;
+ };
};
cpu-map {
@@ -5752,23 +5801,30 @@ cpu2-top-thermal {
trips {
trip-point0 {
- temperature = <90000>;
+ temperature = <108000>;
hysteresis = <2000>;
type = "passive";
};
- trip-point1 {
- temperature = <95000>;
+ cpu2_top_alert1: trip-point1 {
+ temperature = <110000>;
hysteresis = <2000>;
type = "passive";
};
cpu2-critical {
- temperature = <110000>;
+ temperature = <115000>;
hysteresis = <1000>;
type = "critical";
};
};
+
+ cooling-maps {
+ map0 {
+ trip = <&cpu2_top_alert1>;
+ cooling-device = <&cpu2_idle 100 100>;
+ };
+ };
};
cpu2-bottom-thermal {
@@ -5776,23 +5832,30 @@ cpu2-bottom-thermal {
trips {
trip-point0 {
- temperature = <90000>;
+ temperature = <108000>;
hysteresis = <2000>;
type = "passive";
};
- trip-point1 {
- temperature = <95000>;
+ cpu2_bottom_alert1: trip-point1 {
+ temperature = <110000>;
hysteresis = <2000>;
type = "passive";
};
cpu2-critical {
- temperature = <110000>;
+ temperature = <115000>;
hysteresis = <1000>;
type = "critical";
};
};
+
+ cooling-maps {
+ map0 {
+ trip = <&cpu2_bottom_alert1>;
+ cooling-device = <&cpu2_idle 100 100>;
+ };
+ };
};
cpu3-top-thermal {
@@ -5800,23 +5863,30 @@ cpu3-top-thermal {
trips {
trip-point0 {
- temperature = <90000>;
+ temperature = <108000>;
hysteresis = <2000>;
type = "passive";
};
- trip-point1 {
- temperature = <95000>;
+ cpu3_top_alert1: trip-point1 {
+ temperature = <110000>;
hysteresis = <2000>;
type = "passive";
};
cpu3-critical {
- temperature = <110000>;
+ temperature = <115000>;
hysteresis = <1000>;
type = "critical";
};
};
+
+ cooling-maps {
+ map0 {
+ trip = <&cpu3_top_alert1>;
+ cooling-device = <&cpu3_idle 100 100>;
+ };
+ };
};
cpu3-bottom-thermal {
@@ -5824,23 +5894,30 @@ cpu3-bottom-thermal {
trips {
trip-point0 {
- temperature = <90000>;
+ temperature = <108000>;
hysteresis = <2000>;
type = "passive";
};
- trip-point1 {
- temperature = <95000>;
+ cpu3_bottom_alert1: trip-point1 {
+ temperature = <110000>;
hysteresis = <2000>;
type = "passive";
};
cpu3-critical {
- temperature = <110000>;
+ temperature = <115000>;
hysteresis = <1000>;
type = "critical";
};
};
+
+ cooling-maps {
+ map0 {
+ trip = <&cpu3_bottom_alert1>;
+ cooling-device = <&cpu3_idle 100 100>;
+ };
+ };
};
cpu4-top-thermal {
@@ -5848,23 +5925,30 @@ cpu4-top-thermal {
trips {
trip-point0 {
- temperature = <90000>;
+ temperature = <108000>;
hysteresis = <2000>;
type = "passive";
};
- trip-point1 {
- temperature = <95000>;
+ cpu4_top_alert1: trip-point1 {
+ temperature = <110000>;
hysteresis = <2000>;
type = "passive";
};
cpu4-critical {
- temperature = <110000>;
+ temperature = <115000>;
hysteresis = <1000>;
type = "critical";
};
};
+
+ cooling-maps {
+ map0 {
+ trip = <&cpu4_top_alert1>;
+ cooling-device = <&cpu4_idle 100 100>;
+ };
+ };
};
cpu4-bottom-thermal {
@@ -5872,23 +5956,30 @@ cpu4-bottom-thermal {
trips {
trip-point0 {
- temperature = <90000>;
+ temperature = <108000>;
hysteresis = <2000>;
type = "passive";
};
- trip-point1 {
- temperature = <95000>;
+ cpu4_bottom_alert1: trip-point1 {
+ temperature = <110000>;
hysteresis = <2000>;
type = "passive";
};
cpu4-critical {
- temperature = <110000>;
+ temperature = <115000>;
hysteresis = <1000>;
type = "critical";
};
};
+
+ cooling-maps {
+ map0 {
+ trip = <&cpu4_bottom_alert1>;
+ cooling-device = <&cpu4_idle 100 100>;
+ };
+ };
};
cpu5-top-thermal {
@@ -5896,23 +5987,30 @@ cpu5-top-thermal {
trips {
trip-point0 {
- temperature = <90000>;
+ temperature = <108000>;
hysteresis = <2000>;
type = "passive";
};
- trip-point1 {
- temperature = <95000>;
+ cpu5_top_alert1: trip-point1 {
+ temperature = <110000>;
hysteresis = <2000>;
type = "passive";
};
cpu5-critical {
- temperature = <110000>;
+ temperature = <115000>;
hysteresis = <1000>;
type = "critical";
};
};
+
+ cooling-maps {
+ map0 {
+ trip = <&cpu5_top_alert1>;
+ cooling-device = <&cpu5_idle 100 100>;
+ };
+ };
};
cpu5-bottom-thermal {
@@ -5920,23 +6018,30 @@ cpu5-bottom-thermal {
trips {
trip-point0 {
- temperature = <90000>;
+ temperature = <108000>;
hysteresis = <2000>;
type = "passive";
};
- trip-point1 {
- temperature = <95000>;
+ cpu5_bottom_alert1: trip-point1 {
+ temperature = <110000>;
hysteresis = <2000>;
type = "passive";
};
cpu5-critical {
- temperature = <110000>;
+ temperature = <115000>;
hysteresis = <1000>;
type = "critical";
};
};
+
+ cooling-maps {
+ map0 {
+ trip = <&cpu5_bottom_alert1>;
+ cooling-device = <&cpu5_idle 100 100>;
+ };
+ };
};
cpu6-top-thermal {
@@ -5944,23 +6049,30 @@ cpu6-top-thermal {
trips {
trip-point0 {
- temperature = <90000>;
+ temperature = <108000>;
hysteresis = <2000>;
type = "passive";
};
- trip-point1 {
- temperature = <95000>;
+ cpu6_top_alert1: trip-point1 {
+ temperature = <110000>;
hysteresis = <2000>;
type = "passive";
};
cpu6-critical {
- temperature = <110000>;
+ temperature = <115000>;
hysteresis = <1000>;
type = "critical";
};
};
+
+ cooling-maps {
+ map0 {
+ trip = <&cpu6_top_alert1>;
+ cooling-device = <&cpu6_idle 100 100>;
+ };
+ };
};
cpu6-bottom-thermal {
@@ -5968,23 +6080,30 @@ cpu6-bottom-thermal {
trips {
trip-point0 {
- temperature = <90000>;
+ temperature = <108000>;
hysteresis = <2000>;
type = "passive";
};
- trip-point1 {
- temperature = <95000>;
+ cpu6_bottom_alert1: trip-point1 {
+ temperature = <110000>;
hysteresis = <2000>;
type = "passive";
};
cpu6-critical {
- temperature = <110000>;
+ temperature = <115000>;
hysteresis = <1000>;
type = "critical";
};
};
+
+ cooling-maps {
+ map0 {
+ trip = <&cpu6_bottom_alert1>;
+ cooling-device = <&cpu6_idle 100 100>;
+ };
+ };
};
aoss1-thermal {
@@ -6010,23 +6129,30 @@ cpu7-top-thermal {
trips {
trip-point0 {
- temperature = <90000>;
+ temperature = <108000>;
hysteresis = <2000>;
type = "passive";
};
- trip-point1 {
- temperature = <95000>;
+ cpu7_top_alert1: trip-point1 {
+ temperature = <110000>;
hysteresis = <2000>;
type = "passive";
};
cpu7-critical {
- temperature = <110000>;
+ temperature = <115000>;
hysteresis = <1000>;
type = "critical";
};
};
+
+ cooling-maps {
+ map0 {
+ trip = <&cpu7_top_alert1>;
+ cooling-device = <&cpu7_idle 100 100>;
+ };
+ };
};
cpu7-middle-thermal {
@@ -6034,23 +6160,30 @@ cpu7-middle-thermal {
trips {
trip-point0 {
- temperature = <90000>;
+ temperature = <108000>;
hysteresis = <2000>;
type = "passive";
};
- trip-point1 {
- temperature = <95000>;
+ cpu7_middle_alert1: trip-point1 {
+ temperature = <110000>;
hysteresis = <2000>;
type = "passive";
};
cpu7-critical {
- temperature = <110000>;
+ temperature = <115000>;
hysteresis = <1000>;
type = "critical";
};
};
+
+ cooling-maps {
+ map0 {
+ trip = <&cpu7_middle_alert1>;
+ cooling-device = <&cpu7_idle 100 100>;
+ };
+ };
};
cpu7-bottom-thermal {
@@ -6058,23 +6191,30 @@ cpu7-bottom-thermal {
trips {
trip-point0 {
- temperature = <90000>;
+ temperature = <108000>;
hysteresis = <2000>;
type = "passive";
};
- trip-point1 {
- temperature = <95000>;
+ cpu7_bottom_alert1: trip-point1 {
+ temperature = <110000>;
hysteresis = <2000>;
type = "passive";
};
cpu7-critical {
- temperature = <110000>;
+ temperature = <115000>;
hysteresis = <1000>;
type = "critical";
};
};
+
+ cooling-maps {
+ map0 {
+ trip = <&cpu7_bottom_alert1>;
+ cooling-device = <&cpu7_idle 100 100>;
+ };
+ };
};
cpu0-thermal {
@@ -6082,23 +6222,30 @@ cpu0-thermal {
trips {
trip-point0 {
- temperature = <90000>;
+ temperature = <108000>;
hysteresis = <2000>;
type = "passive";
};
- trip-point1 {
- temperature = <95000>;
+ cpu0_alert1: trip-point1 {
+ temperature = <110000>;
hysteresis = <2000>;
type = "passive";
};
cpu0-critical {
- temperature = <110000>;
+ temperature = <115000>;
hysteresis = <1000>;
type = "critical";
};
};
+
+ cooling-maps {
+ map0 {
+ trip = <&cpu0_alert1>;
+ cooling-device = <&cpu0_idle 100 100>;
+ };
+ };
};
cpu1-thermal {
@@ -6106,23 +6253,30 @@ cpu1-thermal {
trips {
trip-point0 {
- temperature = <90000>;
+ temperature = <108000>;
hysteresis = <2000>;
type = "passive";
};
- trip-point1 {
- temperature = <95000>;
+ cpu1_alert1: trip-point1 {
+ temperature = <110000>;
hysteresis = <2000>;
type = "passive";
};
cpu1-critical {
- temperature = <110000>;
+ temperature = <115000>;
hysteresis = <1000>;
type = "critical";
};
};
+
+ cooling-maps {
+ map0 {
+ trip = <&cpu1_alert1>;
+ cooling-device = <&cpu1_idle 100 100>;
+ };
+ };
};
nsphvx0-thermal {

--
2.34.1