Re: [PATCH v1 13/17] mlxsw: core_thermal: Use the .should_bind() thermal zone callback

From: Rafael J. Wysocki
Date: Wed Jul 31 2024 - 09:02:43 EST


On Wed, Jul 31, 2024 at 2:43 PM Ido Schimmel <idosch@xxxxxxxxxx> wrote:
>
> On Tue, Jul 30, 2024 at 08:34:45PM +0200, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> >
> > Make the mlxsw core_thermal driver use the .should_bind() thermal zone
> > callback to provide the thermal core with the information on whether or
> > not to bind the given cooling device to the given trip point in the
> > given thermal zone. If it returns 'true', the thermal core will bind
> > the cooling device to the trip and the corresponding unbinding will be
> > taken care of automatically by the core on the removal of the involved
> > thermal zone or cooling device.
> >
> > It replaces the .bind() and .unbind() thermal zone callbacks (in 3
> > places) which assumed the same trip points ordering in the driver
> > and in the thermal core (that may not be true any more in the
> > future). The .bind() callbacks used loops over trip point indices
> > to call thermal_zone_bind_cooling_device() for the same cdev (once
> > it had been verified) and all of the trip points, but they passed
> > different 'upper' and 'lower' values to it for each trip.
> >
> > To retain the original functionality, the .should_bind() callbacks
> > need to use the same 'upper' and 'lower' values that would be used
> > by the corresponding .bind() callbacks when they are about about to
>
> Nit: s/about about/about/

Yes, thanks!

> > return 'true'. To that end, the 'priv' field of each trip is set
> > during the thermal zone initialization to point to the corresponding
> > 'state' object containing the maximum and minimum cooling states of
> > the cooling device.
> >
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
>
> Please see more comments below, but this patch is going to conflict with
> the series at [1] which is currently under review. How do you want to
> handle that?
>
> https://lore.kernel.org/netdev/cover.1722345311.git.petrm@xxxxxxxxxx/

I may be missing something, but I don't see conflicts between this
patch and the series above that would be hard to resolve at merge
time.

Anyway, I'll try to apply the above series locally and merge it with
this patch, thanks for the heads up!

> > ---
> >
> > This patch only depends on patch [09/17].
> >
> > ---
> > drivers/net/ethernet/mellanox/mlxsw/core_thermal.c | 121 +++++----------------
> > 1 file changed, 34 insertions(+), 87 deletions(-)
> >
> > Index: linux-pm/drivers/net/ethernet/mellanox/mlxsw/core_thermal.c
> > ===================================================================
> > --- linux-pm.orig/drivers/net/ethernet/mellanox/mlxsw/core_thermal.c
> > +++ linux-pm/drivers/net/ethernet/mellanox/mlxsw/core_thermal.c
> > @@ -165,52 +165,22 @@ static int mlxsw_get_cooling_device_idx(
> > return -ENODEV;
> > }
> >
> > -static int mlxsw_thermal_bind(struct thermal_zone_device *tzdev,
> > - struct thermal_cooling_device *cdev)
> > +static bool mlxsw_thermal_should_bind(struct thermal_zone_device *tzdev,
> > + const struct thermal_trip *trip,
> > + struct thermal_cooling_device *cdev,
> > + struct cooling_spec *c)
> > {
> > struct mlxsw_thermal *thermal = thermal_zone_device_priv(tzdev);
> > - struct device *dev = thermal->bus_info->dev;
> > - int i, err;
> > + const struct mlxsw_cooling_states *state = trip->priv;
> >
> > /* If the cooling device is one of ours bind it */
> > if (mlxsw_get_cooling_device_idx(thermal, cdev) < 0)
> > - return 0;
> > + return false;
> >
> > - for (i = 0; i < MLXSW_THERMAL_NUM_TRIPS; i++) {
> > - const struct mlxsw_cooling_states *state = &thermal->cooling_states[i];
> > + c->upper = state->max_state;
> > + c->lower = state->min_state;
> >
> > - err = thermal_zone_bind_cooling_device(tzdev, i, cdev,
> > - state->max_state,
> > - state->min_state,
> > - THERMAL_WEIGHT_DEFAULT);
> > - if (err < 0) {
> > - dev_err(dev, "Failed to bind cooling device to trip %d\n", i);
> > - return err;
> > - }
> > - }
> > - return 0;
> > -}
> > -
> > -static int mlxsw_thermal_unbind(struct thermal_zone_device *tzdev,
> > - struct thermal_cooling_device *cdev)
> > -{
> > - struct mlxsw_thermal *thermal = thermal_zone_device_priv(tzdev);
> > - struct device *dev = thermal->bus_info->dev;
> > - int i;
> > - int err;
> > -
> > - /* If the cooling device is our one unbind it */
> > - if (mlxsw_get_cooling_device_idx(thermal, cdev) < 0)
> > - return 0;
> > -
> > - for (i = 0; i < MLXSW_THERMAL_NUM_TRIPS; i++) {
> > - err = thermal_zone_unbind_cooling_device(tzdev, i, cdev);
> > - if (err < 0) {
> > - dev_err(dev, "Failed to unbind cooling device\n");
> > - return err;
> > - }
> > - }
> > - return 0;
> > + return true;
> > }
> >
> > static int mlxsw_thermal_get_temp(struct thermal_zone_device *tzdev,
> > @@ -239,59 +209,29 @@ static struct thermal_zone_params mlxsw_
> > .no_hwmon = true,
> > };
> >
> > -static struct thermal_zone_device_ops mlxsw_thermal_ops = {
> > - .bind = mlxsw_thermal_bind,
> > - .unbind = mlxsw_thermal_unbind,
> > - .get_temp = mlxsw_thermal_get_temp,
> > -};
>
> Is there a reason to move 'mlxsw_thermal_ops' below?

Not really, it can stay here.

> > -
> > -static int mlxsw_thermal_module_bind(struct thermal_zone_device *tzdev,
> > - struct thermal_cooling_device *cdev)
> > +static bool mlxsw_thermal_module_should_bind(struct thermal_zone_device *tzdev,
> > + const struct thermal_trip *trip,
> > + struct thermal_cooling_device *cdev,
> > + struct cooling_spec *c)
> > {
> > struct mlxsw_thermal_module *tz = thermal_zone_device_priv(tzdev);
> > struct mlxsw_thermal *thermal = tz->parent;
> > - int i, j, err;
> > + const struct mlxsw_cooling_states *state = trip->priv;
>
> Please place it between 'tz' and 'thermal'. Networking code tries to
> maintain reverse xmas tree ordering for local variables.

I will.

Thanks for the comments!