RE: [PATCH 3/3] arm64: zynqmp: Add thermal zones

From: Thangaraj, Senthil Nathan
Date: Tue Sep 03 2024 - 04:33:08 EST


Hi Sean,

Please find my review comments inline.

Thanks,
Senthil.

> -----Original Message-----
> From: Sean Anderson <sean.anderson@xxxxxxxxx>
> Sent: Monday, August 12, 2024 2:51 PM
> To: Simek, Michal <michal.simek@xxxxxxx>; linux-arm-
> kernel@xxxxxxxxxxxxxxxxxxx
> Cc: Rob Herring <robh@xxxxxxxxxx>; Conor Dooley <conor+dt@xxxxxxxxxx>;
> Krzysztof Kozlowski <krzk+dt@xxxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx;
> devicetree@xxxxxxxxxxxxxxx; Sean Anderson <sean.anderson@xxxxxxxxx>
> Subject: [PATCH 3/3] arm64: zynqmp: Add thermal zones
>
> Add some thermal trip points. We can't undervolt the CPUs to save power
> when we underclock them, so there isn't really a point in throttling them until
> we are about to overheat. As such, the passive trip point is right below the
> critical trip point.
>
> The critical trip point is the extended/industrial-grade maximum junction
> temperature of 100C minus the maximum temperature sensor error of 3.5C
> (in the range -55C to 110C). Automotive- and military-grade parts can go up
> to 125C, but as far as I can tell there is no way to detect them at runtime.
> Userspace can adjust the trip points at runtime, but this may not be viable
> when booting above 100C. I think it's reasonable to ask automotive/military
> users to edit their device trees to bump the trip points, but if that proves to be
> an issue we can always go with no default temperatures. However, that
> wouldn't be too nice for the majority of extended/industrial users.
>
> Signed-off-by: Sean Anderson <sean.anderson@xxxxxxxxx>
> ---
>
> arch/arm64/boot/dts/xilinx/zynqmp.dtsi | 86
> ++++++++++++++++++++++++++
> 1 file changed, 86 insertions(+)
>
> diff --git a/arch/arm64/boot/dts/xilinx/zynqmp.dtsi
> b/arch/arm64/boot/dts/xilinx/zynqmp.dtsi
> index 21c1adbaf35f..467f084c6469 100644
> --- a/arch/arm64/boot/dts/xilinx/zynqmp.dtsi
> +++ b/arch/arm64/boot/dts/xilinx/zynqmp.dtsi
> @@ -18,6 +18,7 @@
> #include <dt-bindings/interrupt-controller/irq.h>
> #include <dt-bindings/power/xlnx-zynqmp-power.h>
> #include <dt-bindings/reset/xlnx-zynqmp-resets.h>
> +#include <dt-bindings/thermal/thermal.h>
>
> / {
> compatible = "xlnx,zynqmp";
> @@ -36,6 +37,7 @@ cpus {
> #size-cells = <0>;
>
> cpu0: cpu@0 {
> + #cooling-cells = <2>;
> compatible = "arm,cortex-a53";
> device_type = "cpu";
> enable-method = "psci";
> @@ -46,6 +48,7 @@ cpu0: cpu@0 {
> };
>
> cpu1: cpu@1 {
> + #cooling-cells = <2>;
> compatible = "arm,cortex-a53";
> device_type = "cpu";
> enable-method = "psci";
> @@ -56,6 +59,7 @@ cpu1: cpu@1 {
> };
>
> cpu2: cpu@2 {
> + #cooling-cells = <2>;
> compatible = "arm,cortex-a53";
> device_type = "cpu";
> enable-method = "psci";
> @@ -66,6 +70,7 @@ cpu2: cpu@2 {
> };
>
> cpu3: cpu@3 {
> + #cooling-cells = <2>;
> compatible = "arm,cortex-a53";
> device_type = "cpu";
> enable-method = "psci";
> @@ -406,6 +411,87 @@ ams {
> <&xilinx_ams 27>, <&xilinx_ams 28>, <&xilinx_ams
> 29>;
> };
>
> +
> + tsens_apu: thermal-sensor-apu {
> + compatible = "generic-adc-thermal";
> + #thermal-sensor-cells = <0>;
> + io-channels = <&xilinx_ams 7>;
> + io-channel-names = "sensor-channel";
> + };
> +
> + tsens_rpu: thermal-sensor-rpu {
> + compatible = "generic-adc-thermal";
> + #thermal-sensor-cells = <0>;
> + io-channels = <&xilinx_ams 8>;
> + io-channel-names = "sensor-channel";
> + };
> +
> + tsens_pl: thermal-sensor-pl {
> + compatible = "generic-adc-thermal";
> + #thermal-sensor-cells = <0>;
> + io-channels = <&xilinx_ams 20>;
> + io-channel-names = "sensor-channel";
> + };
> +
> + thermal-zones {
> + apu-thermal {
> + polling-delay-passive = <1000>;
> + polling-delay = <5000>;

How did we arrive at these delays, 1000 and 5000 ? could you please clarify ?

> + thermal-sensors = <&tsens_apu>;
> +
> + trips {
> + apu_passive: passive {
> + temperature = <93000>;
> + hysteresis = <3500>;
> + type = "passive";
> + };
> +
> + apu_critical: critical {
> + temperature = <96500>;
> + hysteresis = <3500>;
> + type = "critical";
> + };
> + };
> +
> + cooling-maps {
> + map {
> + trip = <&apu_passive>;
> + cooling-device =
> + <&cpu0 THERMAL_NO_LIMIT
> THERMAL_NO_LIMIT>,
> + <&cpu1 THERMAL_NO_LIMIT
> THERMAL_NO_LIMIT>,
> + <&cpu2 THERMAL_NO_LIMIT
> THERMAL_NO_LIMIT>,
> + <&cpu3 THERMAL_NO_LIMIT
> THERMAL_NO_LIMIT>;
> + };
> + };
> + };
> +
> + rpu-thermal {
> + polling-delay = <10000>;

Same questions, how did we come up with number 10000

> + thermal-sensors = <&tsens_rpu>;
> +
> + trips {
> + critical {
> + temperature = <96500>;
> + hysteresis = <3500>;
> + type = "critical";
> + };

Any reason why we haven't defined for passive trip point for RPU ?

> + };
> + };
> +
> + pl-thermal {
> + polling-delay = <10000>;
> + thermal-sensors = <&tsens_pl>;
> +
> + trips {
> + critical {
> + temperature = <96500>;
> + hysteresis = <3500>;
> + type = "critical";
> + };
> + };
Same question, any reason why we haven't defined for passive trip point for PL ?
> + };
> + };
> +
> amba: axi {
> compatible = "simple-bus";
> bootph-all;
> --
> 2.35.1.1320.gc452695387.dirty