Re: [PATCH 2/2] arm64: dts: qcom: SM8750: Enable CPUFreq support

From: Akhil P Oommen

Date: Wed Jan 21 2026 - 07:57:58 EST

On 1/21/2026 5:06 PM, Konrad Dybcio wrote:
> On 1/20/26 9:54 PM, Akhil P Oommen wrote:
>> On 1/20/2026 8:13 PM, Konrad Dybcio wrote:
>>> On 1/20/26 12:25 PM, Akhil P Oommen wrote:
>>>> On 1/20/2026 3:44 PM, Konrad Dybcio wrote:
>>>>> On 1/19/26 8:00 PM, Akhil P Oommen wrote:
>>>>>> On 12/11/2025 12:32 AM, Jagadeesh Kona wrote:
>>>>>>> Add the cpucp mailbox, sram and SCMI nodes required to enable
>>>>>>> the CPUFreq support using the SCMI perf protocol on SM8750 SoCs.
>>>>>>>
>>>>>>> Signed-off-by: Jagadeesh Kona <jagadeesh.kona@xxxxxxxxxxxxxxxx>
>>>>>>
>>>>>> Just curious, does this patch enable thermal mitigation for CPU clusters
>>>>>> too?
>>>>>
>>>>> If nothing changed, we have lets-not-explode type mitigations via LMH,
>>>>> but lets-not-burn-the-user would require a skin temp sensor to be
>>>>> wired up, which then could be used to enable some cooling action
>>>>
>>>> In some chipsets, I have noticed that the gpu cooling device throttles
>>>> GPU to the lowest OPP even with not-so-heavy GPU workloads, making it
>>>> unusable-ly slow. My hypothesis was that it was due to unmitigated CPU
>>>> temperature tripping up GPU Tsens.
>>>>
>>>> So, I am wondering if there are any additional CPU cooling related
>>>> changes required to get a reasonable overall performance under thermal
>>>> constraints.
>>>
>>> Yes, something like the aforementioned skin-temp sensor at least..
>>
>> I suppose this sensor is off-chip and slow to react.
>
> Yes, this would be placed somewhere on the chassis of the device to
> reflect the actual temperature that the user could experience (since
> there are regulations about maximum values of that)
>
>>> Today Linux will not throttle the CPUs at all (they're not even declared
>>> as cooling devices) and we sorta agreed that in general it's a good thing
>>> (tm), because otherwise we'd be coding in a cooling profile into the SoC
>>> DTSI without taking into account the cooling capabilities of a given end
>>> device (i.e. in an extreme case, a PC with SM8650 with a cooler that's
>>> 3kg of aluminium vs a Steam Frame headset where the SoC is centimeters
>>> away from your face)
>>>
>>> Currently, we have cooling policies for devices with fans and the only
>>> other action is based on a skin temperature sensor (sc8280xp + x13s).
>>> Everything else is left up to the LMH defaults. AFAIK work is ongoing to
>>> create a more informed solution, that would have to (quite obviously)
>>> live in userland.
>>
>> I can understand that the skin-temp based mitigation is influenced by
>> various design decisions outside of the SoC die. But I think there should
>> be an on-chip sensor based mitigation which is fast and usually for a short
>> duration which helps to avoid extreme temperature or violating the
>> maximum operating point of the chipset. I guess it should depend only on
>> the SoC characteristics as it is a last resort. It may be implemented in
>> SW (like cooling device for Adreno GPU) or in HW. Probably the LMH
>> hardware you mentioned offers this functionality for CPU clusters. I
>> have no clue. :(
>
> Yes, the CPUs are covered.

Does this LMH-based thermal mitigation require any initialization from
Linux? If yes, could you please share a link to a patch which enables it
for any recent chipset for my reference?

-Akhil.

>
>> I am hoping that if this on-chip mitigation is enabled and wired up
>> correctly for CPU clusters (probably DDR too), it would reduce the
>> unnecessary thermal trips on GPU Tsens and help to reach a performance
>> equilibrium which is reasonably good.
>
> Today, the OS is unaware that it can throttle anything else than the
> GPU, so in its view that's the reasonable step to take. Further, any
> device it knows how to throttle, it'll do so in a very jittery fashion
> where it crosses the threshold, gets slowed down, cools a bit, gets
> unthrottled, heats back up, rinse and repeat (because the cooling
> solution of almost any form-factor is not capable of sustaining a
> 100% usage workload for long)
>
> Konrad