[RFC PATCH 0/3] thermal: Add CPU hotplug cooling driver
From: John Madieu
Date: Sun Mar 09 2025 - 08:13:52 EST
This patch series introduces a new thermal cooling driver that implements CPU
hotplug-based thermal management. The driver dynamically takes CPUs offline
during thermal excursions to reduce power consumption and prevent overheating,
while maintaining system stability by keeping at least one CPU online.
1- Problem Statement
Modern SoCs require robust thermal management to prevent overheating under heavy
workloads. Existing cooling mechanisms like frequency scaling may not always
provide sufficient thermal relief, especially in multi-core systems where
per-core thermal contributions can be significant.
2- Solution Overview
The driver:
- Integrates with the Linux thermal framework as a cooling device
- Registers per-CPU cooling devices that respond to thermal trip points
- Uses CPU hotplug operations to reduce thermal load
- Maintains system stability by never taking the boot CPU offline,
regardless of which CPUs are specified in the cooling device list
- Implements proper state tracking and cleanup
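To illustrate the boot CPU rule above, here is a minimal user-space sketch. It is not the driver code: the two-state model (0 = online, 1 = offline), the names plug_set_state() and count_online(), the four-core layout and the choice of CPU 0 as boot CPU are all assumptions made for the example.

```c
#include <stdbool.h>

#define NR_SIM_CPUS 4   /* assumption: four cores, as in the example zone */
#define BOOT_CPU    0   /* assumption: CPU 0 is the boot CPU */

/* Simulated online map; a real driver would ask the kernel instead. */
struct cpu_map {
	bool online[NR_SIM_CPUS];
};

/*
 * Apply a cooling state to one CPU: state 0 keeps it online, state 1
 * takes it offline. A request to offline the boot CPU is refused, so
 * at least one CPU always stays online.
 * Returns 0 on success, -1 if the request is rejected.
 */
int plug_set_state(struct cpu_map *map, int cpu, unsigned long state)
{
	if (cpu < 0 || cpu >= NR_SIM_CPUS || state > 1)
		return -1;
	if (cpu == BOOT_CPU && state == 1)
		return -1;

	map->online[cpu] = (state == 0);
	return 0;
}

/* Count the CPUs that are currently online in the simulated map. */
int count_online(const struct cpu_map *map)
{
	int n = 0;

	for (int cpu = 0; cpu < NR_SIM_CPUS; cpu++)
		if (map->online[cpu])
			n++;
	return n;
}
```

In this model, a cooling map that lists the boot CPU simply has no effect on it, while the other listed CPUs follow the requested state.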
Key Features:
- Dynamic CPU online/offline management based on thermal thresholds
- Device tree-based configuration via thermal zones and trip points
- Hysteresis support through thermal governor interactions
- Safe handling of CPU state transitions during module load/unload
- Compatibility with existing thermal management frameworks
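The hysteresis behaviour listed above can be sketched as follows. This is a simplified user-space model, not what the governor does in full (step_wise also looks at the temperature trend); struct plug_trip and plug_trip_active() are hypothetical names, and temperatures are in millidegrees Celsius as in the device tree.

```c
#include <stdbool.h>

/* Hypothetical trip description, mirroring the DT properties. */
struct plug_trip {
	int temperature;   /* millidegrees Celsius */
	int hysteresis;    /* millidegrees Celsius */
};

/*
 * Decide whether the trip is considered crossed.
 * - Not yet active: activate once temp reaches the trip temperature.
 * - Already active: stay active until temp drops below
 *   (temperature - hysteresis), which avoids rapid toggling of CPUs
 *   when the temperature hovers around the trip point.
 */
bool plug_trip_active(const struct plug_trip *trip, int temp, bool was_active)
{
	if (!was_active)
		return temp >= trip->temperature;

	return temp >= trip->temperature - trip->hysteresis;
}
```

With a trip at 110000 and a hysteresis of 1000, this model would take CPUs offline at 110000 and only allow them back online once the temperature falls below 109000.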
Testing
- Verified on Renesas RZ/G3E platforms with multi-core CPU configurations
- Validated thermal response using artificially injected temperatures (emul_temp)
- Confirmed proper interaction with other cooling devices
- Verified support for 'plug' type trace events
- Tested with step_wise governor
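For anyone wanting to reproduce the emul_temp test, a small user-space helper along these lines should do. It is only a sketch: write_emul_temp() is a hypothetical name, the zone index in the usage note is an assumption, and the emul_temp attribute is only present when the kernel is built with CONFIG_THERMAL_EMULATION.

```c
#include <stdio.h>

/*
 * Write a temperature in millidegrees Celsius to an emul_temp-style
 * sysfs attribute. The path is passed in by the caller because the
 * thermal zone index differs between platforms.
 * Returns 0 on success, -1 on failure.
 */
int write_emul_temp(const char *path, int millicelsius)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;

	ok = fprintf(f, "%d\n", millicelsius) > 0;
	/* fclose() flushes the buffer, so write errors can surface here. */
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}
```

A call such as write_emul_temp("/sys/class/thermal/thermal_zone0/emul_temp", 111000), run as root, would push the zone above the 110000 'plug' trip used in the example below, assuming thermal_zone0 is the CPU zone on the board under test.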
As the 'hot' type is already used for user space notification, I've chosen
'plug' for this new type. Suggestions on this are welcome. Here is an example
of a 'thermal-zone' that integrates the 'plug' type:
```
thermal-zones {
	cpu-thermal {
		polling-delay = <1000>;
		polling-delay-passive = <250>;
		thermal-sensors = <&tsu>;

		cooling-maps {
			map0 {
				trip = <&target>;
				cooling-device = <&cpu0 0 3>, <&cpu3 0 3>;
				contribution = <1024>;
			};

			map1 {
				trip = <&trip_emergency>;
				cooling-device = <&cpu1 0 1>, <&cpu2 0 1>;
				contribution = <1024>;
			};
		};

		trips {
			target: trip-point {
				temperature = <95000>;
				hysteresis = <1000>;
				type = "passive";
			};

			trip_emergency: emergency {
				temperature = <110000>;
				hysteresis = <1000>;
				type = "plug";
			};

			sensor_crit: sensor-crit {
				temperature = <120000>;
				hysteresis = <1000>;
				type = "critical";
			};
		};
	};
};
```
Dependencies
- Requires standard thermal framework components (CONFIG_THERMAL)
- Depends on CPU hotplug support (CONFIG_HOTPLUG_CPU)
- Assumes device tree contains appropriate thermal zone definitions
This series also depends on [1], specifically on patch 6/7,
"arm64: dts: renesas: r9a09g047: Add TSU node".
3- Notes for Reviewers
- Focus areas: Thermal framework integration, CPU state management, and error handling
- Feedback on device tree binding requirements is particularly welcome
- Suggestions for improving the interaction with other governors are appreciated
I look forward to your feedback and guidance on this contribution.
[1] https://patchwork.kernel.org/project/linux-clk/cover/20250227122453.30480-1-john.madieu.xa@xxxxxxxxxxxxxx/
Regards,
John
John Madieu (3):
  thermal/cpuplug_cooling: Add CPU hotplug cooling driver
  tmon: Add support for THERMAL_TRIP_PLUG type
  arm64: dts: renesas: r9a09g047: Add thermal hotplug trip point

 arch/arm64/boot/dts/renesas/r9a09g047.dtsi |  13 +
 drivers/thermal/Kconfig                    |  12 +
 drivers/thermal/Makefile                   |   1 +
 drivers/thermal/cpuplug_cooling.c          | 363 +++++++++++++++++++++
 drivers/thermal/thermal_of.c               |   1 +
 drivers/thermal/thermal_trace.h            |   2 +
 drivers/thermal/thermal_trip.c             |   1 +
 include/uapi/linux/thermal.h               |   1 +
 tools/thermal/tmon/tmon.h                  |   1 +
 tools/thermal/tmon/tui.c                   |   3 +-
 10 files changed, 397 insertions(+), 1 deletion(-)
 create mode 100644 drivers/thermal/cpuplug_cooling.c

--
2.25.1