Re: [PATCH v8 01/29] thermal/core: Add a generic thermal_zone_get_trip() function

From: Ido Schimmel
Date: Sun Mar 12 2023 - 08:14:35 EST


On Mon, Oct 03, 2022 at 11:25:34AM +0200, Daniel Lezcano wrote:
> @@ -1252,9 +1319,10 @@ thermal_zone_device_register_with_trips(const char *type, struct thermal_trip *t
>  		goto release_device;
>
>  	for (count = 0; count < num_trips; count++) {
> -		if (tz->ops->get_trip_type(tz, count, &trip_type) ||
> -		    tz->ops->get_trip_temp(tz, count, &trip_temp) ||
> -		    !trip_temp)
> +		struct thermal_trip trip;
> +
> +		result = thermal_zone_get_trip(tz, count, &trip);
> +		if (result)
>  			set_bit(count, &tz->trips_disabled);
>  	}

Daniel, with this change trip points with a temperature of zero are no
longer disabled. Disabling them was originally added in commit
81ad4276b505 ("Thermal: Ignore invalid trip points"). The mlxsw driver
relies on it (see mlxsw_thermal_module_trips_reset()), and with this
change I see the thermal subsystem repeatedly trying to set the
associated cooling devices to their maximum state. Other drivers might
also be affected.
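
To illustrate the difference outside the kernel, here is a minimal
userspace sketch (not kernel code). struct model_trip, model_get_trip()
and model_disabled_mask() are made-up names for illustration, and the
accessor is assumed to always succeed; only the condition inside the
loop mirrors the hunk quoted above.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for struct thermal_trip; only the field used here. */
struct model_trip {
	int temperature;
};

/* Hypothetical accessor: assumed to always succeed and copy the trip out. */
static int model_get_trip(const struct model_trip *trips, int id,
			  struct model_trip *out)
{
	*out = trips[id];
	return 0;
}

/*
 * Return a bitmask of disabled trips. With check_zero_temp false, only an
 * accessor error disables a trip (the condition in the quoted hunk). With
 * it true, a zero temperature disables the trip as well.
 */
static unsigned long model_disabled_mask(const struct model_trip *trips,
					 int num_trips, bool check_zero_temp)
{
	unsigned long disabled = 0;
	int count;

	/* Keep the shift below within the guaranteed width of unsigned long. */
	assert(num_trips >= 0 && num_trips <= 32);

	for (count = 0; count < num_trips; count++) {
		struct model_trip trip;
		int result = model_get_trip(trips, count, &trip);

		if (result || (check_zero_temp && !trip.temperature))
			disabled |= 1UL << count;
	}

	return disabled;
}
```

For trips with temperatures 75000, 0 and 85000, the mask is empty
without the zero check, and has only bit 1 set with it.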

The following patch solves the problem for me:

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index 55679fd86505..b50931f84aaa 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -1309,7 +1309,7 @@ thermal_zone_device_register_with_trips(const char *type, struct thermal_trip *t
 		struct thermal_trip trip;

 		result = thermal_zone_get_trip(tz, count, &trip);
-		if (result)
+		if (result || !trip.temperature)
 			set_bit(count, &tz->trips_disabled);
 	}

Should I submit it, or do you have a better idea?

Thanks