Re: [PATCH v3 4/8] arm64: dts: qcom: kodiak: Enable CDSP & Modem cooling

From: Dmitry Baryshkov

Date: Mon Jun 15 2026 - 11:25:55 EST


On Mon, Jun 15, 2026 at 11:02:22AM +0530, Gaurav Kohli wrote:
>
>
> On 6/9/2026 4:27 PM, Dmitry Baryshkov wrote:
> > On Tue, Jun 09, 2026 at 03:52:59PM +0530, Gaurav Kohli wrote:
> > > Unlike the CPU, the CDSP/Modem does not throttle its speed automatically
> > > when it reaches high temperatures in kodiak.
> > >
> > > Set up CDSP cooling by throttling the cdsp when it reaches 100°C and
> > > for modem when it reaches to 95°C.
> > >
> > > Remove inherited mdmss cooling-map nodes for Non Modem kodiak variant.
> >
> > Why? If it is a GNSS-only MPSS, does it not provide any thermal
> > mitigation mechanisms? Does ADSP provide those? WPSS?
> >
>
> Hi Dmitry,
>
> Thank you for the review.
>
> Since the remoteproc_mpss node doesn't exist on these boards, the
> cooling-maps that reference it cause DT compilation errors. That's why
> we need to remove the inherited cooling-maps from the SoC DTSI.
> /delete-node/ &remoteproc_mpss;

Ok. Explain that in the commit message. And maybe we need to fix those
boards to provide mpss instead.

>
> Regarding thermal mitigation for other subsystems:
> ->CDSP and Modem are the primary heat sources based on our internal
> thermal testing and evaluation.
> ->ADSP and WPSS have lower power consumption and don't typically reach
> thermal thresholds that require active mitigation
> ->For this, I'm checking with our internal team to confirm if ADSP/WPSS
> provide any TMD mechanism across all targets.

They do. I've posted the dump of qrtr-lookup somewhere.

>
> > >
> > > Signed-off-by: Gaurav Kohli <gaurav.kohli@xxxxxxxxxxxxxxxx>
> > > ---
> > > arch/arm64/boot/dts/qcom/kodiak.dtsi | 127 ++++++++++++++++++++-
> > > .../boot/dts/qcom/qcs6490-radxa-dragon-q6a.dts | 17 +++
> >
> > So, you removed those for Radxa Q6A, but not forRB3 Gen2. Why?
> >
>
> Ack, this is a miss. will fix this.
>
> > > .../dts/qcom/qcs6490-thundercomm-minipc-g1iot.dts | 17 +++
> > > .../boot/dts/qcom/qcs6490-thundercomm-rubikpi3.dts | 17 +++
> > > .../boot/dts/qcom/sc7280-herobrine-lte-sku.dtsi | 18 +++
> > > .../boot/dts/qcom/sc7280-herobrine-wifi-sku.dtsi | 16 +++
> > > 6 files changed, 208 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/arch/arm64/boot/dts/qcom/kodiak.dtsi b/arch/arm64/boot/dts/qcom/kodiak.dtsi
> > > index fa540d8c2615..d345add2d8c8 100644
> > > --- a/arch/arm64/boot/dts/qcom/kodiak.dtsi
> > > +++ b/arch/arm64/boot/dts/qcom/kodiak.dtsi
> > > @@ -3427,6 +3427,9 @@ remoteproc_mpss: remoteproc@4080000 {
> > > qcom,smem-states = <&modem_smp2p_out 0>;
> > > qcom,smem-state-names = "stop";
> > > + #cooling-cells = <3>;
> > > + tmd-names = "pa", "modem";
> > > +
> > > status = "disabled";
> > > glink-edge {
> > > @@ -4787,6 +4790,9 @@ remoteproc_cdsp: remoteproc@a300000 {
> > > qcom,smem-states = <&cdsp_smp2p_out 0>;
> > > qcom,smem-state-names = "stop";
> > > + #cooling-cells = <2>;
> > > + tmd-names = "cdsp_sw";
> >
> > I'm going to review only this DT, the comments apply to the rest of
> > them.
> >
> > So, we have two cases, CDSP and MPSS. Why does CDSP have only 2 cells?
> > Just because tmd-names has only one name? What if we add another
> > mitigation (which can be added in the firmware), do we suddenly have to
> > change number of cells and all the cooling devices to reflect it?
> >
>
> As Cdsp has only one relevant tmd to mitigate, so we have used cooling cells
> as 2. But i will change this to 3 as this is backward compatible.
>
> > Finally. If I understand correctly, these mitigtion mechanisms are
> > provided by the firmware. Firmware differs between the boards. Vendors
> > (in theory) can change them. Why do we list these names here, in the SoC
> > DT?
> >
>
> Below are the main reason for this, replied in other thread also.
> Please guide, if i can use qcom_pas_data to define names.
>
> Following Daniel's series [1], the thermal framework supports
> mapping multiple cooling devices per remoteproc/device via indexed
> cooling-cells.
>
> 1) The thermal framework's cooling-maps reference
> cooling devices by index (for #cooling-cells = <3>). Without tmd-names,
> there's no way to know which index corresponds to which TMD, as firmware
> may return tmd-names in any order.

You can #define the indices to the well known names, turning those into
bindings.

>
> below are the changes post new thermal mapping changes:
> DT: tmd-names = "cdsp_sw", "xyz";
> Firmware: ["cdsp_sw", "xyz1", "xyz2",]
> Driver registers: Only "cdsp_sw" (index 0) and "xyz" (index 1)

What if the DT has "cdsp_sw", but the firmware doesn't report it? That's
an error, as it seems.

>
> This allows cooling-maps like below:
> cooling-device = <&remoteproc 0 ...> // "cdsp_sw"
> cooling-device = <&remoteproc 1 ...> // "xyz"
>
> 2) Not all firmware-provided TMDs should be
> exposed as cooling devices. The tmd-names property acts as a filter,
> allowing board-specific DT to select only the relevant TMDs for that
> platform.

But then the properties should be defined in the board DT rather than
the platform DT.

>
> [1]
> https://lore.kernel.org/all/20260526140802.1059293-12-daniel.lezcano@xxxxxxxxxxxxxxxx/
>
> Shall i use pas data to define tmd-names instead of dt ?

Let's settle in the thread with Daniel.

>
> > > +
> > > status = "disabled";
> > > glink-edge {
> > > + cooling-maps {
> > > + map0 {
> > > + trip = <&mdmss0_alert1>;
> > > + cooling-device = <&remoteproc_mpss 0 0 2>;
> >
> > What does this mean? I assume that the first cell is one of the
> > mechanisms. What is the difference between them? Do we really need to
> > list them one by one here?
> >
>
> Let me check, if i can document different tmd's somewhere:
>
> -> modem tmd used for Modem Processor mitigation.
> -> pa is used for Power Amplifier mitigation.

What does that mean?

>
> And we need to list them for binding purpose mainly.
>
> > What do other cells mean? Why are they 0 and 2 rather than
> > THERMAL_NO_LIMIT? How does one come with those values? This should all
> > be documented and explained somewhere.
> >
>
> Will change to THERMAL_NO_LIMIT. Let me check, if i can use
> qli doc for documentation.
>
> > > + };
> > > +
> > > + map1 {
> > > + trip = <&mdmss0_alert1>;
> > > + cooling-device = <&remoteproc_mpss 1 0 2>;
> > > + };
> > > + };
> > > };
> >
>

--
With best wishes
Dmitry