Re: [PATCH v5 6/6] arm64: dts: qcom: Enable cpu cooling devices for QCS9075 platforms
From: Konrad Dybcio
Date: Thu Jan 09 2025 - 09:30:48 EST
On 8.01.2025 5:08 PM, Manaf Meethalavalappu Pallikunhi wrote:
>
> Hi Dmitry,
>
>
> On 1/8/2025 6:16 PM, Dmitry Baryshkov wrote:
>> On Wed, Jan 08, 2025 at 05:57:06PM +0530, Manaf Meethalavalappu Pallikunhi wrote:
>>> Hi Dmitry,
>>>
>>>
>>> On 1/3/2025 11:21 AM, Dmitry Baryshkov wrote:
>>>> On Tue, Dec 31, 2024 at 05:31:41PM +0530, Manaf Meethalavalappu Pallikunhi wrote:
>>>>> Hi Dmitry,
>>>>>
>>>>> On 12/30/2024 9:10 PM, Dmitry Baryshkov wrote:
>>>>>> On Sun, Dec 29, 2024 at 08:53:32PM +0530, Wasim Nazir wrote:
>>>>>>> From: Manaf Meethalavalappu Pallikunhi <quic_manafm@xxxxxxxxxxx>
>>>>>>>
>>>>>>> In QCS9100 SoC, the safety subsystem monitors all thermal sensors and
>>>>>>> takes corrective action for each subsystem based on sensor violation
>>>>>>> to comply with safety standards. But as QCS9075 is a non-safe SoC, it
>>>>>>> requires conventional thermal mitigation to control thermal for
>>>>>>> different subsystems.
>>>>>>>
>>>>>>> The cpu frequency throttling for different cpu tsens is enabled in
>>>>>>> hardware as the first line of defense for cpu thermal control. But
>>>>>>> QCS9075 SoC has a higher ambient specification. During high ambient
>>>>>>> conditions, even the lowest frequency with multiple cores can slowly
>>>>>>> build up heat over time, which can lead to thermal run-away
>>>>>>> situations. Restricting cpu cores in this scenario helps further
>>>>>>> thermal control and avoids thermal critical violation.
>>>>>>>
>>>>>>> Add cpu idle injection cooling bindings for cpu tsens thermal zones
>>>>>>> as a mitigation for cpu subsystem prior to thermal shutdown.
>>>>>>>
>>>>>>> Add cpu frequency cooling devices that will be used by userspace
>>>>>>> thermal governor to mitigate skin thermal management.
>>>>>> Does anything prevent us from having this config as a part of the basic
>>>>>> sa8775p.dtsi setup? If HW is present in the base version but it is not
>>>>>> accessible for whatever reason, please move it to the base device config
>>>>>> and use status "disabled" or "reserved" in the respective board files.
>>>>> Sure, I will move the idle injection node for each cpu to sa8775p.dtsi and
>>>>> keep it in disabled state. We still want to keep the #cooling-cells
>>>>> property for CPU in board files, as we don't want to enable any cooling
>>>>> device in base DT.
>>>> "we don't want" is not a proper justification. So, no.
>>> As noted in the commit, thermal cooling mitigation is only necessary for
>>> non-safe SoCs. Adding this cooling cell property to the CPU node in the base
>>> DT (sa8775p.dtsi), which is shared by both safe and non-safe SoCs, would
>>> violate the requirements for safe SoCs. Therefore, we will include it only
>>> in non-safe SoC boards.
>> "is only necessary" is fine. It means that it is an optional part which
>> is going to be unused / ignored / duplicate functionality on the "safe"
>> SoCs. What kind of requirement is going to be violated in this way?
>
> From the perspective of a safe SoC, any software mitigation that
> compromises the safety subsystem's compliance should not be allowed.
> Enabling the cooling device also opens up the sysfs interface for
> userspace, which we may not fully control. Userspace apps or partner
> apps might inadvertently use it. Therefore, we believe it is better not
> to expose such an interface, as it is not required for that SoC and
> helps to avoid opening up an interface that could potentially lead to a
> safety failure.
So, to recalibrate, would this be accurate?

* "safe" SoCs are the ones with a SAIL/Safety Island block which
  performs thermal throttling without OS intervention and does so
  on all SAIL-equipped platforms

* SoCs with a SAIL are intended to be used in e.g. cars and if we
  want to keep mainline viable on those, we must comply with some
  regulations, which prevent e.g. software thermal throttling

* idle injection provides measurable improvements over
  software-based frequency throttling on platforms with SAIL

Konrad