Re: [PATCH v3 5/8] ACPI: thermal: Hold thermal zone lock around trip updates
From: Rafael J. Wysocki
Date: Tue Aug 01 2023 - 14:52:21 EST
On Tue, Aug 1, 2023 at 8:39 PM Daniel Lezcano <daniel.lezcano@xxxxxxxxxx> wrote:
>
>
> Hi Rafael,
>
> On 25/07/2023 14:16, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> >
> > There is a race condition between acpi_thermal_trips_update() and
> > acpi_thermal_check_fn(), because the trip points may get updated while
> > the latter is running which in theory may lead to inconsistent results.
> > For example, if two trips are updated together, using the temperature
> > value of one of them from before the update and the temperature value
> > of the other one from after the update may not lead to the expected
> > outcome.
> >
> > To address this, make acpi_thermal_trips_update() hold the thermal zone
> > lock across the entire update of trip points.
>
> As commented in patch 3/8, having a driver locking a thermal core
> structure is not right and goes to the opposite direction of the recent
> cleanups.
It already happens though, because thermal_zone_device_update() locks
the zone and it is called by the driver.
> Don't we have 2 race conditions:
>
> acpi_thermal_trips_update() + thermal_zone_device_check()
>
> acpi_thermal_trips_update() + acpi_thermal_trips_update()
I'm not sure what you mean.
First off, acpi_thermal_check_fn() needs to be locked against anything
using the trips in the zone's trips[] table, in particular
thermal_get_trend().
However, thermal_get_trend() also uses the driver's private trips
information, so it needs to be locked against
acpi_thermal_trips_update().
And obviously the latter needs to be locked against acpi_thermal_check_fn().
> For the former, we can disable the thermal zone, update and then enable
Disabling the thermal zone is an idea, but it would be necessary to do
that in both acpi_thermal_check_fn() and acpi_thermal_trips_update().
Also I'm not sure how different that would be from holding the zone
lock across the updates.
Moreover, acpi_thermal_trips_update() would then need to hold the
local lock around the thermal zone disable/enable which would be way
uglier than just using the zone lock directly in it.