Hi Thara,
On 1/10/2022 11:25 PM, Thara Gopinath wrote:
Hi Manaf,
On 1/7/22 1:56 PM, Manaf Meethalavalappu Pallikunhi wrote:
Whenever a thermal zone is in a trip-violated state, there is a chance
that the same thermal zone can be disabled, either via the thermal
core API or via the thermal zone sysfs. Once it is disabled, the framework
bails out of any re-evaluation of the thermal zone. As a result, a zone
that is already in a mitigation state stays in that state
until it is re-enabled.
To avoid the above-mentioned issue, on a thermal zone disable request,
reset the thermal zone and clear the mitigation for each trip explicitly.
Signed-off-by: Manaf Meethalavalappu Pallikunhi <quic_manafm@xxxxxxxxxxx>
---
drivers/thermal/thermal_core.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index 51374f4..e288c82 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -447,10 +447,18 @@ static int thermal_zone_device_set_mode(struct thermal_zone_device *tz,
thermal_zone_device_update(tz, THERMAL_EVENT_UNSPECIFIED);
- if (mode == THERMAL_DEVICE_ENABLED)
+ if (mode == THERMAL_DEVICE_ENABLED) {
thermal_notify_tz_enable(tz->id);
- else
+ } else {
+ int trip;
+
+ /* make sure all previous throttlings are cleared */
+ thermal_zone_device_init(tz);
It looks weird to do an init when you are actually disabling the thermal zone.
+ for (trip = 0; trip < tz->trips; trip++)
+ handle_thermal_trip(tz, trip);
So this is exactly what thermal_zone_device_update does except that thermal_zone_device_update checks for the mode and bails out if the zone is disabled.
This will work because as you explained in v2, the temperature is reset in thermal_zone_device_init and handle_thermal_trip will remove the mitigation if any.
My two cents here (Rafael and Daniel can comment more on this).
I think it will be cleaner if we can have a third mode THERMAL_DEVICE_DISABLING and have thermal_zone_device_update handle clearing the mitigation. So this will look like
	if (mode == THERMAL_DEVICE_DISABLED)
		tz->mode = THERMAL_DEVICE_DISABLING;
	else
		tz->mode = mode;

	thermal_zone_device_update(tz, THERMAL_EVENT_UNSPECIFIED);

	if (mode == THERMAL_DEVICE_DISABLED)
		tz->mode = mode;
You will have to update update_temperature to set tz->temperature = THERMAL_TEMP_INVALID and thermal_zone_set_trips to set tz->prev_low_trip = -INT_MAX and tz->prev_high_trip = INT_MAX for THERMAL_DEVICE_DISABLING mode.
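
For readers following the proposal above, here is a minimal userspace sketch of the transient-mode flow. It is not kernel code: the model_* names, the MODEL_TEMP_INVALID value and the mitigation_level field are hypothetical stand-ins invented for illustration, and the real cleanup would go through trip handling and the governor rather than a single integer.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types; not the real definitions. */
enum model_mode { MODEL_DISABLED, MODEL_ENABLED, MODEL_DISABLING };

#define MODEL_TEMP_INVALID (-274000) /* assumed sentinel for "no reading" */

struct model_zone {
	enum model_mode mode;
	int temperature;
	int prev_low_trip;
	int prev_high_trip;
	int mitigation_level; /* hypothetical: 0 means no throttling applied */
};

/*
 * Models the update path: it bails out for a disabled zone, and for a
 * zone in the transient DISABLING mode it resets the cached state and
 * drops any mitigation instead of evaluating the trips.
 */
static void model_zone_update(struct model_zone *tz)
{
	assert(tz != NULL);

	if (tz->mode == MODEL_DISABLED)
		return;

	if (tz->mode == MODEL_DISABLING) {
		tz->temperature = MODEL_TEMP_INVALID;
		tz->prev_low_trip = -INT_MAX;
		tz->prev_high_trip = INT_MAX;
		tz->mitigation_level = 0;
		return;
	}

	/* MODEL_ENABLED: normal trip evaluation would run here (omitted). */
}

/* Models set_mode: a disable request passes through DISABLING first. */
static void model_zone_set_mode(struct model_zone *tz, enum model_mode mode)
{
	assert(tz != NULL);

	tz->mode = (mode == MODEL_DISABLED) ? MODEL_DISABLING : mode;

	model_zone_update(tz);

	if (mode == MODEL_DISABLED)
		tz->mode = mode;
}
```

Calling model_zone_set_mode(&tz, MODEL_DISABLED) on a throttled zone leaves it in MODEL_DISABLED with the cached temperature and trip window reset, which is the behaviour the third mode is meant to provide.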
I think just updating the above fields doesn't guarantee that mitigation is completely cleared for all governors. For the step_wise governor, to make sure mitigation is removed completely, we have to set each thermal instance's ->initialized = false as well.
If we add that to the above list of variables in update_temperature() under if (mode == THERMAL_DEVICE_DISABLING), it is the same as what the thermal_zone_device_init function does in the current patch. We would just be resetting the same fields in a different place under a new mode, right?
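
To make the comparison concrete, here is a small userspace sketch of the full set of fields discussed in this thread: the cached temperature, the previous trip window, and the per-instance initialized flag. The sketch_* names, the fixed-size instance array and the SKETCH_TEMP_INVALID value are assumptions made for illustration; the kernel keeps its thermal instances on a list and this is not the actual thermal_zone_device_init code.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_TEMP_INVALID (-274000) /* assumed sentinel for "no reading" */
#define SKETCH_MAX_INSTANCES 4        /* hypothetical fixed capacity */

/* Stand-in for a thermal instance; only the flag discussed here. */
struct sketch_instance {
	bool initialized;
};

/* Stand-in for a thermal zone; only the fields named in the thread. */
struct sketch_zone {
	int temperature;
	int prev_low_trip;
	int prev_high_trip;
	size_t num_instances;
	struct sketch_instance instances[SKETCH_MAX_INSTANCES];
};

/*
 * Resets every field mentioned in the discussion. Whether this runs in
 * an init-style helper or under a DISABLING branch of the update path,
 * the set of fields touched is the same.
 */
static void sketch_zone_reset(struct sketch_zone *tz)
{
	size_t i;

	assert(tz != NULL);
	assert(tz->num_instances <= SKETCH_MAX_INSTANCES);

	tz->temperature = SKETCH_TEMP_INVALID;
	tz->prev_low_trip = -INT_MAX;
	tz->prev_high_trip = INT_MAX;

	/* Mark each instance uninitialized so the governor re-evaluates it. */
	for (i = 0; i < tz->num_instances; i++)
		tz->instances[i].initialized = false;
}
```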
Thanks,
Manaf