Re: [RESEND PATCH v3 2/2] thermal: core: introduce thermal zone device mode control
From: Enric Balletbo i Serra
Date: Wed Jul 04 2018 - 06:36:50 EST
Hi Matthias,
Sorry for late reply, my memory is bad so I need to look at this again. The
patch was send some time ago and there are pending changes to do but then I
switched. I'll take a look, but did you saw why this patch was not merged [1]?
Maybe that could answer some of your questions.
Best regards,
Enric
[1] https://lkml.org/lkml/2018/2/27/910
On 03/07/18 19:13, Matthias Kaehlcke wrote:
> On Thu, Jun 28, 2018 at 05:33:02PM -0700, Matthias Kaehlcke wrote:
>> Hi,
>>
>> I stumbled across this patch since I'm currently poking around with
>> early thermal bringup on a platform and this patch has been integrated
>> in our development tree.
>>
>> I'm seeing some unexpected behaviors, which could entirely due to
>> wrong expectation from my side. I only have some basic working
>> knowledge of the thermal framework, just want to double check and
>> perhaps learn a thing or two.
>>
>> On Mon, Feb 26, 2018 at 03:41:18PM +0100, Enric Balletbo i Serra wrote:
>>> From: Zhang Rui <rui.zhang@xxxxxxxxx>
>>>
>>> Thermal "mode" sysfs attribute is introduced to enable/disable a thermal
>>> zone, and .get_mode()/.set_mode() callback is introduced for platform
>>> thermal driver to enable/disable the hardware thermal control logic. And
>>> thermal core takes no action upon thermal zone enable/disable.
>>>
>>> Actually, this is not quite right because thermal core still pokes those
>>> disabled thermal zones occasionally, e.g. upon system resume.
>>>
>>> To fix this, a new flag 'mode' is introduced in struct thermal_zone_device
>>> to represent the thermal zone mode, and several decisions have been made
>>> based on this flag, including
>>> 1. check the thermal zone mode right after it's registered.
>>> 2. skip updating thermal zone if the zone is disabled
>>> 3. stop the polling timer when the thermal zone is disabled
>>>
>>> Note: basically, this patch doesn't affect the existing always-enabled
>>> thermal zones much, with just one exception -
>>> thermal zone .get_mode() must be well prepared to reflect the real thermal
>>> zone status upon the thermal zone registration.
>>
>> From my perspective this looks like a pretty significant change. For
>> the platform I'm working on I added a thermal zone to the device tree,
>> with the expectation that it would be enabled. Judging from the code
>> without this patch this expectation seems to be naive, since
>> of-thermal.c sets tz->mode to THERMAL_DEVICE_DISABLED, so apparently
>> either userspace or some driver should call _set_mode(tz,
>> THERMAL_DEVICE_ENABLED). However even without this the thermal zone
>> appears to be active (I didn't really test end-to-end yet, but at
>> least thermal_zone_device_update() is called and calls
>> handle_thermal_trip()). Not sure why this is the case with
>> THERMAL_DEVICE_DISABLED, but before I learned about the existence of
>> the flag my expectation was that the zone would be enabled.
>>
>> With this patch thermal_zone_device_update() is skipped if the zone
>> hasn't been explictly enabled, which may be consistent with the state
>> of 'tz->mode', but effectively changes the previous/current behavior.
>>
>> Not sure if I'm just dumbly overlooking something obvious or if there
>> is an actual problem with of_thermal (and maybe others).
>
> The problem is that there are now two 'mode' fields, tzd->mode and the
> other typically tzd->devdata->mode, and tzd->mode is never set to enabled.
>
>> thermal zone .get_mode() must be well prepared to reflect the real thermal
>> zone status upon the thermal zone registration.
>
> For of_thermal tzd->mode is initialized with the result of .get_mode()
> when the zone is registered. At this time no sensor has been added
> to the zone, hence the zone is disabled. When a sensor is added later
> tzd->devdata->mode is set to enabled, however tzd->mode remains disabled:
>
> tzd->mode = DISABLED
> tzd->devdata->mode = DISABLED
>
> of_parse_thermal_zones
> thermal_zone_device_register
> tzd->mode = tzd->get_mode() // => DISABLED
>
> <sensor>_probe
> thermal_zone_of_sensor_register
> tzd->set_mode(THERMAL_DEVICE_ENABLED)
> tzd->devdata->mode = ENABLED
>
> One way to fix this for of_thermal could be to setting tzd->mode in
> .set_mode() in addition to setting tzd->devdata->mode. However this
> feels like a workaround/hack. Personally I find it confusing to have
> two mode fields for a thermal zone device. Maybe tzd->mode should
> replace tzd->devdata->mode?
>