Re: [PATCH v3] thermal: core: Call monitor_thermal_zone() if zone temperature is invalid
From: Rafael J. Wysocki
Date: Thu Jul 04 2024 - 10:23:27 EST
Hi,
On Thu, Jul 4, 2024 at 2:52 PM Neil Armstrong <neil.armstrong@xxxxxxxxxx> wrote:
>
> Hi,
>
> On 04/07/2024 14:49, Daniel Lezcano wrote:
> > On 04/07/2024 13:46, Rafael J. Wysocki wrote:
> >> From: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> >>
> >> Commit 202aa0d4bb53 ("thermal: core: Do not call handle_thermal_trip()
> >> if zone temperature is invalid") caused __thermal_zone_device_update()
> >> to return early if the current thermal zone temperature was invalid.
> >>
> >> This was done to avoid running handle_thermal_trip() and governor
> >> callbacks in that case, which led to confusion. However, it went too
> >> far because monitor_thermal_zone() still needs to be called even when
> >> the zone temperature is invalid to ensure that it will be updated
> >> eventually in case thermal polling is enabled and the driver has no
> >> other means to notify the core of zone temperature changes (for example,
> >> it does not register an interrupt handler or ACPI notifier).
> >>
> >> Also if the .set_trips() zone callback is expected to set up monitoring
> >> interrupts for a thermal zone, it needs to be provided with valid
> >> boundaries and that can only be done if the zone temperature is known.
> >>
> >> Accordingly, to ensure that __thermal_zone_device_update() will
> >> run again after a failing zone temperature check, make it call
> >> monitor_thermal_zone() regardless of whether or not the zone
> >> temperature is valid and make the latter schedule a thermal zone
> >> temperature update if the zone temperature is invalid even if
> >> polling is not enabled for the thermal zone (however, if this
> >> continues to fail, give up after some time).
> >
> > Rafael,
> >
> > do we agree that we should fix the current issue this way for now, because we are close to the merge window, but that this is not the proper fix?
>
> I've tested this patch, but I have no opinion about it.
>
> I sent https://lore.kernel.org/all/20240704-topic-sm8x50-upstream-fix-battmgr-temp-tz-warn-v1-1-9d66d6f6efde@xxxxxxxxxx/ which
> fixes the warning print, leaving the option for thermal core to update the tz once it becomes available,
> which is the initial goal of this patchset.
Thank you!
I gather that I can use the v2 of the $subject patch without worrying
about the problem you have reported.
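
For anyone following the thread, the behavior described in the quoted
changelog can be modeled roughly as below. This is a standalone userspace
sketch and not the patch itself: struct zone_model, zone_update(),
next_update_delay(), the TEMP_INVALID sentinel, both RETRY_* values and the
1.5x back-off step are all assumptions made for illustration. The changelog
only says that the core gives up "after some time".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these names or values come from the patch. */
#define TEMP_INVALID   (-274000)  /* sentinel below absolute zero, millidegrees C */
#define RETRY_DELAY_MS 250u       /* assumed first recheck delay */
#define RETRY_LIMIT_MS 120000u    /* assumed point at which rechecks stop */

struct zone_model {
	int temperature;          /* last reading, or TEMP_INVALID */
	unsigned int polling_ms;  /* 0 means polling is disabled */
	unsigned int recheck_ms;  /* delay used while the reading is invalid */
};

/*
 * Decide when the next update should run. Returns a delay in milliseconds,
 * or 0 if no update should be scheduled.
 */
unsigned int next_update_delay(struct zone_model *z)
{
	unsigned int delay;

	assert(z);

	if (z->temperature != TEMP_INVALID) {
		/* Valid reading: reset the recheck state, follow normal polling. */
		z->recheck_ms = RETRY_DELAY_MS;
		return z->polling_ms;
	}

	/* Invalid reading: recheck even if polling is disabled ... */
	if (z->recheck_ms > RETRY_LIMIT_MS)
		return 0;  /* ... but give up once the limit is exceeded. */

	delay = z->recheck_ms;
	z->recheck_ms += z->recheck_ms / 2;  /* assumed back-off between attempts */
	return delay;
}

/*
 * One update pass. The scheduling decision is made regardless of whether
 * the reading is valid; trip handling and governor callbacks (not modeled)
 * would run only when this returns true.
 */
bool zone_update(struct zone_model *z, int reading, unsigned int *delay_ms)
{
	z->temperature = reading;
	*delay_ms = next_update_delay(z);
	return reading != TEMP_INVALID;
}
```

The point of the model is the ordering in zone_update(): the rescheduling
decision is made before the early return for an invalid reading, so a zone
whose sensor is not ready yet is still rechecked later, even when polling is
disabled.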