[PATCH v1] thermal: core: Fix the handling of invalid trip points

From: Rafael J. Wysocki
Date: Thu May 16 2024 - 13:20:33 EST


From: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>

Commit 9ad18043fb35 ("thermal: core: Send trip crossing notifications
at init time if needed") overlooked the case when a trip point that
has started as invalid is set to a valid temperature later. Namely,
the initial threshold value for all trips is zero, so if a previously
invalid trip becomes valid and its (new) low temperature is above the
zone temperature, a spurious trip crossing notification will occur and
it may trigger the WARN_ON() in handle_thermal_trip().

To address this, set the initial threshold for all trips to INT_MAX.

There is also the case when a valid writable trip becomes invalid that
requires special handling. First, in accordance with the change
mentioned above, the trip's threshold needs to be set to INT_MAX to
avoid the same issue. Second, if the trip in question is passive and
it has been crossed by the thermal zone temperature on the way up, the
zone's passive count has been incremented and it is in the passive
polling mode, so its passive count needs to be adjusted to allow the
passive polling to be turned off eventually.

Fixes: 9ad18043fb35 ("thermal: core: Send trip crossing notifications at init time if needed")
Fixes: 042a3d80f118 ("thermal: core: Move passive polling management to the core")
Reported-by: Zhang Rui <rui.zhang@xxxxxxxxx>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
---
drivers/thermal/thermal_core.c | 9 ++++++++-
drivers/thermal/thermal_trip.c | 18 ++++++++++++++++++
2 files changed, 26 insertions(+), 1 deletion(-)

Index: linux-pm/drivers/thermal/thermal_core.c
===================================================================
--- linux-pm.orig/drivers/thermal/thermal_core.c
+++ linux-pm/drivers/thermal/thermal_core.c
@@ -1401,8 +1401,15 @@ thermal_zone_device_register_with_trips(
tz->device.class = thermal_class;
tz->devdata = devdata;
tz->num_trips = num_trips;
- for_each_trip_desc(tz, td)
+ for_each_trip_desc(tz, td) {
td->trip = *trip++;
+ /*
+ * Mark all thresholds as invalid to start with even though
+ * this only matters for the trips that start as invalid and
+ * become valid later.
+ */
+ td->threshold = INT_MAX;
+ }

thermal_set_delay_jiffies(&tz->passive_delay_jiffies, passive_delay);
thermal_set_delay_jiffies(&tz->polling_delay_jiffies, polling_delay);
Index: linux-pm/drivers/thermal/thermal_trip.c
===================================================================
--- linux-pm.orig/drivers/thermal/thermal_trip.c
+++ linux-pm/drivers/thermal/thermal_trip.c
@@ -137,6 +137,24 @@ void thermal_zone_set_trip_temp(struct t
if (trip->temperature == temp)
return;

+ if (temp == THERMAL_TEMP_INVALID) {
+ struct thermal_trip_desc *td = trip_to_trip_desc(trip);
+
+ if (trip->type == THERMAL_TRIP_PASSIVE &&
+ tz->temperature >= td->threshold) {
+ /*
+ * The trip has been crossed, so the thermal zone's
+ * passive count needs to be adjusted.
+ */
+ tz->passive--;
+ WARN_ON_ONCE(tz->passive < 0);
+ }
+ /*
+ * Invalidate the threshold to avoid triggering a spurious
+ * trip crossing notification when the trip becomes valid.
+ */
+ td->threshold = INT_MAX;
+ }
WRITE_ONCE(trip->temperature, temp);
thermal_notify_tz_trip_change(tz, trip);
}