Re: [PATCH] Fix mt7622.dtsi thermal cpu

From: Eric Woudstra
Date: Thu Jun 24 2021 - 05:59:40 EST



For Marvell:

https://www.google.com/url?sa=t&source=web&rct=j&url=https://wiki.kobol.io/helios4/files/som/brochure_a38x_microsom_2017-09-05.pdf

Armada38x maximum die temperature 115 degrees Celcius. They really get hotter then 100.

But for mt7622 I cannot find this value

⁣Get BlueMail for Android ​

On Jun 23, 2021, 10:08 PM, at 10:08 PM, Daniel Lezcano <daniel.lezcano@xxxxxxxxxx> wrote:
>On 23/06/2021 20:43, Eric Woudstra wrote:
>>
>> I choose "hot" before, because 87 degrees seems ok to start frequency
>> throttling. But, yes, it should be passive.
>>
>> 87 is still quite low if I compare this temperature with the
>> wrt3200acm Marvell dual core arm soc. They even went above 100
>> degrees so I feel for an arm processor inside a router box it is fine
>> to use 87 degrees But maybe someone at Mediatek can give some more
>> details about operating temperatures.
>
>Sometimes, the SoC vendor puts a high temperature in the DT just to
>export the thermal zone and deal with it from userspace. So putting the
>high temp allow the userspace (usually a thermal engine - Android
>stuff)
>to deal with the mitigation without a kernel interaction.
>
>Having more than 100°C could be this kind of setup. Only the operating
>temperature from the hardware documentation will tell the safe
>temperature for the silicon.
>
>IMO, 77°C is a good compromise until getting the documented temp. 87°C
>sounds to me a bit too hot.
>
>> It may be possible to leave the active map in the device tree as some
>> users of the bananapi might choose to install a fan as it is one of
>> the options.
>
>The active trip only makes sense if the cooling device is a fan (or any
>active device), so the mapping points to a fan node, like:
>
>https://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux.git/tree/arch/arm64/boot/dts/rockchip/rk3399-khadas-edge.dtsi#n192
>
>If there is no such [pwm] fan output on the board, no active trip point
>should be added.
>
>> ⁣Get BlueMail for Android ​
>>
>> On Jun 23, 2021, 5:58 PM, at 5:58 PM, Daniel Lezcano
>> <daniel.lezcano@xxxxxxxxxx> wrote:
>>> On 23/06/2021 17:35, Eric Woudstra wrote:
>>>> It is only useful to set 1 map with the regulated temperature for
>>>> cpu frequency throttling. Same as in the kernel document
>>>> example.
>>>>
>>>>
>>>> It has no use to set frequency scaling on 2 different
>>>> temperature trip points, as the lowest one makes sure the higher
>>>> one(s) are never reached.
>>>
>>> I looked more closely the DT and there is a misunderstanding of
>>> the thermal framework in the definition.
>>>
>>> There is one trip point with the passive type and the cpu cooling
>>> device, followed by a second trip point with the active type *but*
>>> the same cpu cooling device. That is wrong.
>>>
>>> And finally, there is the hot trip point as a third mapping and
>>> the same cooling device.
>>>
>>> The hot trip point is only there to notify userspace and let it
>>> take an immediate action to prevent an emergency shutdown when
>>> reaching the critical temperature.
>>>
>>>> It can be applied only at 1 trip point. Multiple trip points is
>>>> only usefully for fan control to make sure the fan is not too
>>>> noisy when it is not necessary to be noisy.
>>>>
>>>>
>>>> The CPU will almost come to a dead stop when it starts to pass
>>>> the lowest thermal map with frequency throttling.
>>>>
>>>> This is why it is a bug and needs a fix, not only adjustment.
>>>
>>> Yes, you are right. It should be something like (verbatim copy):
>>>
>>> diff --git a/arch/arm64/boot/dts/mediatek/mt7622.dtsi
>>> b/arch/arm64/boot/dts/mediatek/mt7622.dtsi index
>>> 890a942ec608..88c81d24f4ff 100644 ---
>>> a/arch/arm64/boot/dts/mediatek/mt7622.dtsi +++
>>> b/arch/arm64/boot/dts/mediatek/mt7622.dtsi @@ -136,24 +136,18 @@
>>> secmon_reserved: secmon@43000000 {
>>>
>>> thermal-zones { cpu_thermal: cpu-thermal { -
>>> polling-delay-passive = <1000>; + polling-delay-passive = <250>;
>>> polling-delay = <1000>;
>>>
>>> thermal-sensors = <&thermal 0>;
>>>
>>> trips { cpu_passive: cpu-passive { - temperature = <47000>; +
>>> temperature = <77000>; hysteresis = <2000>; type = "passive"; };
>>>
>>> - cpu_active: cpu-active { - temperature = <67000>; -
>>> hysteresis = <2000>; - type = "active"; - }; - cpu_hot:
>>> cpu-hot { temperature = <87000>; hysteresis = <2000>; @@ -173,18
>>> +167,6 @@ map0 { cooling-device = <&cpu0 THERMAL_NO_LIMIT
>>> THERMAL_NO_LIMIT>, <&cpu1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; }; -
>>> - map1 { - trip = <&cpu_active>; - cooling-device =
>>> <&cpu0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, - <&cpu1
>>> THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; - }; - - map2 { -
>>> trip = <&cpu_hot>; - cooling-device = <&cpu0 THERMAL_NO_LIMIT
>>> THERMAL_NO_LIMIT>, - <&cpu1 THERMAL_NO_LIMIT
>>> THERMAL_NO_LIMIT>; - }; }; }; };
>>>
>>>
>>> -- <http://www.linaro.org/> Linaro.org │ Open source software for
>>> ARM SoCs
>>>
>>> Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook |
>>> <http://twitter.com/#!/linaroorg> Twitter |
>>> <http://www.linaro.org/linaro-blog/> Blog
>>
>
>
>--
><http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs
>
>Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook |
><http://twitter.com/#!/linaroorg> Twitter |
><http://www.linaro.org/linaro-blog/> Blog