Re: [PATCH v3 1/8] thermal: core: Add mechanism for connecting trips with driver data

From: Rafael J. Wysocki
Date: Fri Aug 04 2023 - 07:14:35 EST


On Fri, Aug 4, 2023 at 10:17 AM Daniel Lezcano
<daniel.lezcano@xxxxxxxxxx> wrote:
>
> On 03/08/2023 21:58, Rafael J. Wysocki wrote:
> > On Thu, Aug 3, 2023 at 6:20 PM Daniel Lezcano <daniel.lezcano@xxxxxxxxxx> wrote:
> >>
> >> On 03/08/2023 16:15, Rafael J. Wysocki wrote:
> >>> On Thu, Aug 3, 2023 at 3:06 PM Daniel Lezcano <daniel.lezcano@xxxxxxxxxx> wrote:
> >>>>
> >>>> On 02/08/2023 18:48, Rafael J. Wysocki wrote:
> >>>>
> >>>> [ ... ]
> >>>>
> >>>>>> Let me check if I can do something on top of your series to move it in
> >>>>>> the ACPI driver.
> >>>>>
> >>>>> It doesn't need to be on top of my series, so if you have an idea,
> >>>>> please just let me know what it is.
> >>>>>
> >>>>> It can't be entirely in the ACPI driver AFAICS, though, because
> >>>>> trips[i] need to be modified on updates and they belong to the core.
> >>>>> Hence, the driver needs some help from the core to get to them. It
> >>>>> can be something like "this is my trip tag and please give me the
> >>>>> address of the trip matching it" or similar, but it is needed, because
> >>>>> the driver has to assume that the trip indices used by it initially
> >>>>> may change.
> >>>>
> >>>> May be I'm missing something but driver_ref does not seems to be used
> >>>> except when assigning it, no?
> >>>
> >>> It is used on the other side. That is, the value assigned to the trip
> >>> field in it is accessed via trip_ref in the driver.
> >>>
> >>> The idea is that the driver puts a pointer to its local struct
> >>> thermal_trip_ref into a struct thermal_trip and the core stores the
> >>> address of that struct thermal_trip in there, which allows the driver
> >>> to access the struct thermal_trip via its local struct
> >>> thermal_trip_ref going forward.
> >>>
> >>> Admittedly, this is somewhat convoluted.
> >>>
> >>> I have an alternative approach in the works, just for illustration
> >>> purposes if nothing else, but I have encountered a problem that I
> >>> would like to ask you about.
> >>>
> >>> Namely, zone disabling is not particularly useful for preventing the
> >>> zone from being used while the trips are updated, because it has side
> >>> effects. First, it triggers __thermal_zone_device_update() and a
> >>> netlink message every time the mode changes, which can be kind of
> >>> overcome.
> >>
> >> Right
> >>
> >>> But second, if the mode is "disabled", it does not actually
> >>> prevent things like __thermal_zone_get_trip() from running and the
> >>> zone lock is the only thing that can be used for that AFAICS.
> >> >
> >>> So by "disabling" a thermal zone, did you mean changing its mode to
> >>> "disabled" or something else?
> >>
> >> Yes, that is what I meant.
> >>
> >> May be the initial proposal by updating the thermal trips pointer can
> >> solve that [1]
> >
> > No, it can't. An existing trips[] table cannot be replaced with a new
> > one with different trip indices, because those indices are already in
> > use. And if the indices are the same, there's no reason to replace
> > trips.
> >
> >> IMO we can assume the trip point changes are very rare (if any), so
> >> rebuilding a new trip array and update the thermal zone with the pointer
> >> may solve the situation.
> >>
> >> The routine does a copy of the trips array, so it can reorder it without
> >> impacting the array passed as a parameter. And it can take the lock.
> >
> > The driver can take a lock as well. Forbidding drivers to use the
> > zone lock is an artificial limitation without technical merit IMV.
>
> Yes, it is technically possible to take a lock from a driver. However,
> from a higher perspective, we have a core framework which is
> self-contained and we have a back-end which forces us to export this lock.
>
> Even if it is possible, it is not desirable because we break the
> self-containment and thus that will make future changes in the core
> framework complicated because of the interactions with back-end drivers.

So the counter argument here is that using the zone lock directly in
the driver is the most straightforward way of addressing the use case
at hand, which is the need to update trip points in a non-racy way.
Everything else is more complex and the reasons for adding the extra
complexity can be questioned.

Self-containment is nice, but in some cases it is just not worth
enforcing it all the way through at the cost of increased code
complexity.

Anyway, I'm going to post a new version of this patch series later
today which uses a somewhat different approach. It is a bit more
complex, but maybe this is fine.