Re: [PATCH v3] thermal/core: Clear all mitigation when thermal zone is disabled
From: Daniel Lezcano
Date: Wed Jan 19 2022 - 14:12:17 EST
Hi Manaf,
On 19/01/2022 20:05, Manaf Meethalavalappu Pallikunhi wrote:
> Hi Rafael/Daniel,
>
> Could you please check and comment ?
It is on my todo list; I'll review it before the end of the week.
Regards
-- Daniel
> On 1/11/2022 2:15 AM, Manaf Meethalavalappu Pallikunhi wrote:
>> Hi Thara,
>>
>> On 1/10/2022 11:25 PM, Thara Gopinath wrote:
>>> Hi Manaf,
>>>
>>> On 1/7/22 1:56 PM, Manaf Meethalavalappu Pallikunhi wrote:
>>>> When a thermal zone is in a trip-violated state, its mode can still be
>>>> disabled, either via the thermal core API or via the thermal zone
>>>> sysfs interface. Once the zone is disabled, the framework bails out of
>>>> any re-evaluation of it. As a result, a zone that is already in a
>>>> mitigation state stays in that state until it is re-enabled.
>>>>
>>>> To avoid this, on a thermal zone disable request, reset the thermal
>>>> zone and explicitly clear the mitigation for each trip.
>>>>
>>>> Signed-off-by: Manaf Meethalavalappu Pallikunhi <quic_manafm@xxxxxxxxxxx>
>>>> ---
>>>> drivers/thermal/thermal_core.c | 12 ++++++++++--
>>>> 1 file changed, 10 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
>>>> index 51374f4..e288c82 100644
>>>> --- a/drivers/thermal/thermal_core.c
>>>> +++ b/drivers/thermal/thermal_core.c
>>>> @@ -447,10 +447,18 @@ static int thermal_zone_device_set_mode(struct thermal_zone_device *tz,
>>>>  	thermal_zone_device_update(tz, THERMAL_EVENT_UNSPECIFIED);
>>>> -	if (mode == THERMAL_DEVICE_ENABLED)
>>>> +	if (mode == THERMAL_DEVICE_ENABLED) {
>>>>  		thermal_notify_tz_enable(tz->id);
>>>> -	else
>>>> +	} else {
>>>> +		int trip;
>>>> +
>>>> +		/* make sure all previous throttlings are cleared */
>>>> +		thermal_zone_device_init(tz);
>>>
>>> It looks weird to do an init when you are actually disabling the
>>> thermal zone.
>>>
>>>
>>>> +		for (trip = 0; trip < tz->trips; trip++)
>>>> +			handle_thermal_trip(tz, trip);
>>>
>>> So this is exactly what thermal_zone_device_update does except that
>>> thermal_zone_device_update checks for the mode and bails out if the
>>> zone is disabled.
>>> This will work because as you explained in v2, the temperature is
>>> reset in thermal_zone_device_init and handle_thermal_trip will remove
>>> the mitigation if any.
>>>
>>> My two cents here (Rafael and Daniel can comment more on this).
>>>
>>> I think it will be cleaner if we can have a third mode
>>> THERMAL_DEVICE_DISABLING and have thermal_zone_device_update handle
>>> clearing the mitigation. So this will look like
>>>	if (mode == THERMAL_DEVICE_DISABLED)
>>>		tz->mode = THERMAL_DEVICE_DISABLING;
>>>	else
>>>		tz->mode = mode;
>>>
>>>	thermal_zone_device_update(tz, THERMAL_EVENT_UNSPECIFIED);
>>>
>>>	if (mode == THERMAL_DEVICE_DISABLED)
>>>		tz->mode = mode;
>>>
>>> You will have to update update_temperature to set tz->temperature =
>>> THERMAL_TEMP_INVALID and thermal_zone_set_trips to set
>>> tz->prev_low_trip = -INT_MAX and tz->prev_high_trip = INT_MAX for
>>> THERMAL_DEVICE_DISABLING mode.
>>
>> I think updating just the above fields doesn't guarantee that
>> mitigation is completely cleared for all governors. For the step_wise
>> governor, to make sure mitigation is removed completely, we also have
>> to set thermal-instance->initialized = false for each instance.
>>
>> If we add that to the above list of variables in update_temperature()
>> under if (mode == THERMAL_DEVICE_DISABLING), it is the same as what
>> thermal_zone_device_init() does in the current patch. We would just be
>> resetting the same fields in a different place under a new mode, right?
>>
>> Thanks,
>>
>> Manaf
>>
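
For readers following the thread, the behaviour under discussion can be sketched with a small user-space model. This is only an illustration under stated assumptions: every toy_* name, the single trip point and the use of INT_MIN as a stand-in for THERMAL_TEMP_INVALID are hypothetical and not kernel code. It shows that an update path which bails out for a disabled zone leaves mitigation in place, and that resetting the temperature and re-running the trip handler on disable (roughly the role of thermal_zone_device_init() plus the handle_thermal_trip() loop in the patch) clears it.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Toy model only: names echo the thread, but none of this is kernel code. */
enum toy_mode { TOY_DISABLED, TOY_ENABLED };

#define TOY_TEMP_INVALID INT_MIN	/* hypothetical stand-in for THERMAL_TEMP_INVALID */

struct toy_zone {
	enum toy_mode mode;
	int temperature;	/* last sampled temperature, millidegrees C */
	int trip_temp;		/* a single trip point, for simplicity */
	int cooling_state;	/* 0 means no mitigation */
};

/* Trivial governor: throttle while the known temperature is at or above the trip. */
static void toy_handle_trip(struct toy_zone *tz)
{
	if (tz->temperature != TOY_TEMP_INVALID &&
	    tz->temperature >= tz->trip_temp)
		tz->cooling_state = 1;
	else
		tz->cooling_state = 0;
}

/* Re-evaluate the zone; like the update path in the thread, bail out if disabled. */
static void toy_update(struct toy_zone *tz, int sensor_temp)
{
	if (tz->mode != TOY_ENABLED)
		return;
	tz->temperature = sensor_temp;
	toy_handle_trip(tz);
}

/* Change mode; optionally clear mitigation on disable, as the patch proposes. */
static void toy_set_mode(struct toy_zone *tz, enum toy_mode mode,
			 bool clear_on_disable)
{
	tz->mode = mode;
	if (mode == TOY_DISABLED && clear_on_disable) {
		tz->temperature = TOY_TEMP_INVALID;	/* reset step */
		toy_handle_trip(tz);			/* re-run trip handling */
	}
}

/* Returns the cooling state left behind after disabling a throttled zone. */
static int toy_scenario(bool clear_on_disable)
{
	struct toy_zone tz = { TOY_ENABLED, TOY_TEMP_INVALID, 90000, 0 };

	toy_update(&tz, 95000);		/* trip violated, mitigation applied */
	assert(tz.cooling_state == 1);
	toy_set_mode(&tz, TOY_DISABLED, clear_on_disable);
	toy_update(&tz, 50000);		/* ignored: the zone is disabled */
	return tz.cooling_state;
}
```

In this model toy_scenario(false) returns 1 (the zone stays throttled while disabled) and toy_scenario(true) returns 0. The model has no per-instance governor state, so it does not capture the step_wise point about thermal-instance->initialized raised above.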