RE: [RFC PATCH 0/3] thermal: Add CPU hotplug cooling driver

From: Biju Das
Date: Mon Mar 10 2025 - 06:18:02 EST


Hi John,

Thanks for the patch.

> -----Original Message-----
> From: John Madieu <john.madieu.xa@xxxxxxxxxxxxxx>
> Sent: 09 March 2025 12:13
> Subject: [RFC PATCH 0/3] thermal: Add CPU hotplug cooling driver
>
> This patch series introduces a new thermal cooling driver that implements CPU hotplug-based thermal
> management. The driver dynamically takes CPUs offline during thermal excursions to reduce power
> consumption and prevent overheating, while maintaining system stability by keeping at least one CPU
> online.
>
> 1- Problem Statement
>
> Modern SoCs require robust thermal management to prevent overheating under heavy workloads. Existing
> cooling mechanisms like frequency scaling may not always provide sufficient thermal relief, especially
> in multi-core systems where per-core thermal contributions can be significant.
>
> 2- Solution Overview
>
> The driver:
>
> - Integrates with the Linux thermal framework as a cooling device
> - Registers per-CPU cooling devices that respond to thermal trip points
> - Uses CPU hotplug operations to reduce thermal load
> - Maintains system stability by never taking the boot CPU offline, regardless of which CPUs
> are specified in the cooling device list.
> - Implements proper state tracking and cleanup
>
> Key Features:
>
> - Dynamic CPU online/offline management based on thermal thresholds
> - Device tree-based configuration via thermal zones and trip points
> - Hysteresis support through thermal governor interactions
> - Safe handling of CPU state transitions during module load/unload
> - Compatibility with existing thermal management frameworks
>
> Testing
>
> - Verified on Renesas RZ/G3E platforms with multi-core CPU configurations
> - Validated thermal response using artificial load generation (emul_temp)
> - Confirmed proper interaction with other cooling devices
> - Verified support for 'plug' type trace events
> - Tested with step_wise governor
>
> As the 'hot' type is already used for user space notification, I've chosen 'plug' for this new type.
> Suggestions on this are welcome. Here is an example of a 'thermal-zone' that integrates the 'plug' type:
>
> ```
> thermal-zones {
>     cpu-thermal {
>         polling-delay = <1000>;
>         polling-delay-passive = <250>;
>         thermal-sensors = <&tsu>;
>
>         cooling-maps {
>             map0 {
>                 trip = <&target>;
>                 cooling-device = <&cpu0 0 3>, <&cpu3 0 3>;
>                 contribution = <1024>;
>             };

Is it not possible to include cpu1 and cpu2 here as well for DVFS passive cooling?
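
For example, something like the below. This is an untested sketch: the <0 3> state ranges for cpu1 and cpu2 simply mirror the cpu0/cpu3 entries above, and it assumes the cpu1 and cpu2 nodes are set up as cooling devices in the same way as cpu0 and cpu3.

```
map0 {
    trip = <&target>;
    /* cpu1 and cpu2 entries are assumed, copied from cpu0/cpu3 */
    cooling-device = <&cpu0 0 3>, <&cpu1 0 3>,
                     <&cpu2 0 3>, <&cpu3 0 3>;
    contribution = <1024>;
};
```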

>
>             map1 {
>                 trip = <&trip_emergency>;
>                 cooling-device = <&cpu1 0 1>, <&cpu2 0 1>;
>                 contribution = <1024>;
>             };
>
>         };

Is it not possible to make cpu3 a hot-pluggable cooling device here as well?
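
For example, something like the below. Again an untested sketch: the <0 1> range for cpu3 just mirrors the cpu1/cpu2 entries, and I have not checked whether the driver allows a CPU that is already listed in map0 to also be used as a hotplug cooling device.

```
map1 {
    trip = <&trip_emergency>;
    /* cpu3 entry is assumed, copied from cpu1/cpu2 */
    cooling-device = <&cpu1 0 1>, <&cpu2 0 1>, <&cpu3 0 1>;
    contribution = <1024>;
};
```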

Cheers,
Biju