Re: [RFC 3/3] ARM: dts: Don't overheat the Odroid XU3-Lite on high load

From: Anand Moon
Date: Wed Feb 17 2016 - 14:54:15 EST


Hi Krzysztof,

On 17 February 2016 at 12:25, Krzysztof Kozlowski
<k.kozlowski@xxxxxxxxxxx> wrote:
> After adding cpufreq-dt support to Exynos542x, the Odroid XU3-Lite can
> be easily overheated when launching eight CPU-intensive tasks:
> thermal thermal_zone3: critical temperature reached(121 C),shutting down
>
> This seems to be specific to Odroid XU3-Lite board which officially
> supports lower frequencies than regular XU3 or XU4. When working at
> maximum CPU speed (1800 MHz big and 1300 MHz LITTLE) in warmer place for
> longer time, the fan fails to cool down the board and it reaches
> critical temperature.
>
> Add CPU cooling to Exynos5422/5800 to fix this issue. When reaching 95
> degrees of Celsius, the board will slow down by 3 steps (around
> 1400/1000 MHz). When reaching 110 degrees of Celsius go to 600 MHz.
>
> Signed-off-by: Krzysztof Kozlowski <k.kozlowski@xxxxxxxxxxx>
> ---
> arch/arm/boot/dts/exynos5422-cpu-thermal.dtsi | 41 +++++++++++++++++++++++++++
> 1 file changed, 41 insertions(+)
>
> diff --git a/arch/arm/boot/dts/exynos5422-cpu-thermal.dtsi b/arch/arm/boot/dts/exynos5422-cpu-thermal.dtsi
> index 2b289d7c0d13..66073ce29aee 100644
> --- a/arch/arm/boot/dts/exynos5422-cpu-thermal.dtsi
> +++ b/arch/arm/boot/dts/exynos5422-cpu-thermal.dtsi
> @@ -34,6 +34,16 @@
> hysteresis = <5000>; /* millicelsius */
> type = "active";
> };
> + cpu_alert3: cpu-alert-3 {
> + temperature = <95000>; /* millicelsius */
> + hysteresis = <5000>; /* millicelsius */
> + type = "passive";
> + };
> + cpu_alert4: cpu-alert-4 {
> + temperature = <110000>; /* millicelsius */
> + hysteresis = <5000>; /* millicelsius */
> + type = "passive";
> + };
> cpu_crit0: cpu-crit-0 {
> temperature = <120000>; /* millicelsius */
> hysteresis = <0>; /* millicelsius */
> @@ -53,6 +63,37 @@
> trip = <&cpu_alert2>;
> cooling-device = <&fan0 2 3>;
> };
> +
> + /*
> + * When reaching cpu_alert3, reduce CPU
> + * by 3 steps. On Exynos5422/5800 that would
> + * be: 1400 MHz and 1000 MHz.
> + */
> + map3 {
> + trip = <&cpu_alert3>;
> + cooling-device = <&cpu0 3 3>;
> + };
> + map4 {
> + trip = <&cpu_alert3>;
> + cooling-device = <&cpu4 3 3>;
> + };
> +
> + /*
> + * When reaching cpu_alert4, reduce CPU
> + * to 600 MHz (11 steps for big, 7 steps for
> + * LITTLE).
> + * Exynos5420 has less OPPs and reversed
> + * numbering of CPUs (big/LITTLE) so this
> + * would not match.
> + */
> + map5 {
> + trip = <&cpu_alert4>;
> + cooling-device = <&cpu0 7 7>;
> + };
> + map6 {
> + trip = <&cpu_alert4>;
> + cooling-device = <&cpu4 11 11>;
> + };
> };
> };
> };
> --
> 2.5.0
>

Could you amend this patch to include the following changes?

diff --git a/arch/arm/boot/dts/exynos5422-cpu-thermal.dtsi b/arch/arm/boot/dts/exynos5422-cpu-thermal.dtsi
index 66073ce..4e72637 100644
--- a/arch/arm/boot/dts/exynos5422-cpu-thermal.dtsi
+++ b/arch/arm/boot/dts/exynos5422-cpu-thermal.dtsi
@@ -16,8 +16,8 @@
thermal-zones {
cpu0_thermal: cpu0-thermal {
thermal-sensors = <&tmu_cpu0 0>;
- polling-delay-passive = <0>;
- polling-delay = <0>;
+ polling-delay-passive = <250>; /* milliseconds */
+ polling-delay = <500>; /* milliseconds */
trips {
cpu_alert0: cpu-alert-0 {
temperature = <50000>; /* millicelsius */
---
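For context on the two values above: as far as I understand the thermal device tree binding, a delay of 0 disables the polling timer and leaves the zone relying on sensor interrupts, while a non-zero value makes the thermal core re-read the sensor periodically. The sketch below is only a simplified model of that choice, not kernel code; the function name and its parameters are made up for illustration.

```python
def next_poll_delay_ms(passive_active, polling_delay, polling_delay_passive):
    """Return milliseconds until the next poll, or None when polling is off.

    Hypothetical helper: models how a thermal zone might choose between
    'polling-delay' and 'polling-delay-passive' (both in milliseconds).
    """
    # While passive cooling is in effect the shorter passive delay applies;
    # otherwise the regular polling delay is used.
    delay = polling_delay_passive if passive_active else polling_delay
    # A delay of 0 means "do not poll, rely on interrupts".
    return delay if delay > 0 else None


# Before the change (0/0): never polled. After (500/250): polled in both modes.
print(next_poll_delay_ms(False, 0, 0))
print(next_poll_delay_ms(False, 500, 250))
print(next_poll_delay_ms(True, 500, 250))
```

With the proposed values the zone would be re-evaluated every 500 ms normally and every 250 ms once a passive trip has been crossed.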
Running the Linaro pm-qa diagnostic tool gives:
----------------------------------------------------------

thermal_01.28: checking 'thermal_zone2'/'trip_point_2_temp' ='110000'... Ok
thermal_01.29: checking 'cdev0_trip_point' exists in
'/sys/devices/virtual/thermal/thermal_zone0'... Ok
thermal_01.30: checking 'thermal_zone0/cdev0_trip_point' valid binding... Ok
thermal_01.31: checking 'cdev4_trip_point' exists in
'/sys/devices/virtual/thermal/thermal_zone0'... Ok
thermal_01.32: checking 'thermal_zone0/cdev4_trip_point' valid binding... Err
thermal_01.33: checking 'cdev4_trip_point' exists in
'/sys/devices/virtual/thermal/thermal_zone0'... Ok
thermal_01.34: checking 'thermal_zone0/cdev4_trip_point' valid binding... Err
thermal_01.35: checking 'cdev4_trip_point' exists in
'/sys/devices/virtual/thermal/thermal_zone0'... Ok
thermal_01.36: checking 'thermal_zone0/cdev4_trip_point' valid binding... Err
thermal_01.37: checking 'cdev4_trip_point' exists in
'/sys/devices/virtual/thermal/thermal_zone0'... Ok
thermal_01.38: checking 'thermal_zone0/cdev4_trip_point' valid binding... Err

thermal_01: fail
-------------------------------------------------------
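To cross-check the "valid binding" failures by hand, something like the sketch below could be used. It is not the pm-qa implementation; it assumes that a binding counts as valid when the trip index stored in cdevN_trip_point matches an existing trip_point_N_temp file in the same zone directory. The function name is hypothetical.

```python
import os
import re


def check_cdev_bindings(zone_dir):
    """Return {cdev_name: (trip_index, ok)} for one thermal zone directory.

    Assumption: 'ok' means the cdev is bound to a trip index for which a
    trip_point_<N>_temp file exists. pm-qa may use a different criterion.
    """
    entries = os.listdir(zone_dir)

    # Collect the trip indices that actually exist in this zone.
    trips = set()
    for name in entries:
        m = re.fullmatch(r"trip_point_(\d+)_temp", name)
        if m:
            trips.add(int(m.group(1)))

    # Compare every cdevN_trip_point value against the known trip indices.
    results = {}
    for name in sorted(entries):
        m = re.fullmatch(r"(cdev\d+)_trip_point", name)
        if not m:
            continue
        with open(os.path.join(zone_dir, name)) as f:
            trip = int(f.read().strip())
        results[m.group(1)] = (trip, trip in trips)
    return results


if __name__ == "__main__":
    zone = "/sys/devices/virtual/thermal/thermal_zone0"
    if os.path.isdir(zone):
        for cdev, (trip, ok) in check_cdev_bindings(zone).items():
            print(cdev, "trip", trip, "Ok" if ok else "Err")
```

Running it on the board against thermal_zone0 should show which cooling devices point at a trip index that has no matching trip point.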
I also got a lot of errors:

root@odroidxu4l:~# cpu[ 3050.847663] cpu cpu4: Failed to find dev_opp: -19
[ 3171.640836] cpu cpu4: device_opp_debug_create_link: Failed to create link
[ 3171.646197] cpu cpu4: _add_list_dev: Failed to register opp debugfs (-12)
[ 3171.653574] cpu cpu7: device_opp_debug_create_link: Failed to create link
[ 3171.659752] cpu cpu7: _add_list_dev: Failed to register opp debugfs (-12)
[ 3171.697011] cpu cpu5: cpufreq_init: failed to get clk: -2
[ 3171.732505] cpu cpu6: cpufreq_init: failed to get clk: -2
[ 3171.768160] cpu cpu7: cpufreq_init: failed to get clk: -2

Tested on Odroid-XU4

Reviewed-by: Anand Moon <linux.amoon@xxxxxxxxx>
Tested-by: Anand Moon <linux.amoon@xxxxxxxxx>

Best Regards
-Anand Moon
