Re: [PATCH v1] thermal: gov_step_wise: Go straight to instance->lower when mitigation is over
From: Rafael J. Wysocki
Date: Mon Jun 24 2024 - 14:44:08 EST
On Sat, Jun 22, 2024 at 2:28 PM Rafael J. Wysocki <rjw@xxxxxxxxxxxxx> wrote:
>
> From: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
>
> Commit b6846826982b ("thermal: gov_step_wise: Restore passive polling
> management") attempted to fix a Step-Wise thermal governor issue
> introduced by commit 042a3d80f118 ("thermal: core: Move passive polling
> management to the core"), which caused the governor to leave cooling
> devices in high states, by partially reverting that commit.
>
> However, this turns out to be insufficient on some systems due to
> interactions between the governor code restored by commit b6846826982b
> and the passive polling management in the thermal core.
>
> For this reason, revert commit b6846826982b and make the governor set
> the target cooling device state to the "lower" one as soon as the zone
> temperature falls below the threshold of the trip point corresponding
> to the given thermal instance, which means that thermal mitigation is
> not necessary any more.
>
> Before this change the "lower" cooling device state would be reached in
> steps through the passive polling mechanism, which was questionable for
> three reasons: (1) cooling devices were kept in high states when that was
> not necessary (and it could adversely impact performance), (2) it only
> worked for thermal zones with nonzero passive_delay_jiffies value, and
> (3) passive polling belongs to the core and should not be hijacked by
> governors for their internal purposes.
>
> Fixes: b6846826982b ("thermal: gov_step_wise: Restore passive polling management")
> Closes: https://lore.kernel.org/linux-pm/6759ce9f-281d-4fcd-bb4c-b784a1cc5f6e@xxxxxxxxxxxxxxxxxxxxxx
> Reported-by: Jens Glathe <jens.glathe@xxxxxxxxxxxxxxxxxxxxxx>
> Tested-by: Jens Glathe <jens.glathe@xxxxxxxxxxxxxxxxxxxxxx>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> ---
> drivers/thermal/gov_step_wise.c | 23 +++++------------------
> 1 file changed, 5 insertions(+), 18 deletions(-)
>
> Index: linux-pm/drivers/thermal/gov_step_wise.c
> ===================================================================
> --- linux-pm.orig/drivers/thermal/gov_step_wise.c
> +++ linux-pm/drivers/thermal/gov_step_wise.c
> @@ -55,7 +55,11 @@ static unsigned long get_target_state(st
>  		if (cur_state <= instance->lower)
>  			return THERMAL_NO_TARGET;
>
> -		return clamp(cur_state - 1, instance->lower, instance->upper);
> +		/*
> +		 * If 'throttle' is false, no mitigation is necessary, so
> +		 * request the lower state for this instance.
> +		 */
> +		return instance->lower;
>  	}
>
>  	return instance->target;
> @@ -93,23 +97,6 @@ static void thermal_zone_trip_update(str
>  		if (instance->initialized && old_target == instance->target)
>  			continue;
>
> -		if (trip->type == THERMAL_TRIP_PASSIVE) {
> -			/*
> -			 * If the target state for this thermal instance
> -			 * changes from THERMAL_NO_TARGET to something else,
> -			 * ensure that the zone temperature will be updated
> -			 * (assuming enabled passive cooling) until it becomes
> -			 * THERMAL_NO_TARGET again, or the cooling device may
> -			 * not be reset to its initial state.
> -			 */
> -			if (old_target == THERMAL_NO_TARGET &&
> -			    instance->target != THERMAL_NO_TARGET)
> -				tz->passive++;
> -			else if (old_target != THERMAL_NO_TARGET &&
> -				 instance->target == THERMAL_NO_TARGET)
> -				tz->passive--;
> -		}
> -
>  		instance->initialized = true;
>
>  		mutex_lock(&instance->cdev->lock);
>
If there is no feedback, I'm going to assume that this is fine with everybody.
Thanks!
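
For illustration only, here is a minimal user-space sketch of the behavioral difference in the "not throttling, temperature dropping" branch touched by the first hunk. It is not kernel code: struct instance_model, NO_TARGET, clamp_ul(), target_before(), target_after() and updates_to_reach_lower() are hypothetical stand-ins invented for this example, and the model ignores trends, locking and everything else the governor does.

```c
/* Hypothetical stand-in for THERMAL_NO_TARGET (assumption for this sketch). */
#define NO_TARGET (~0UL)

/* Hypothetical, stripped-down model of a thermal instance's state limits. */
struct instance_model {
	unsigned long lower;
	unsigned long upper;
};

typedef unsigned long (*target_fn)(const struct instance_model *in,
				   unsigned long cur_state);

/* Local helper playing the role of clamp() for unsigned long values. */
static unsigned long clamp_ul(unsigned long val, unsigned long lo,
			      unsigned long hi)
{
	if (val < lo)
		return lo;
	if (val > hi)
		return hi;
	return val;
}

/* Behavior before the patch: step down by one state per update. */
static unsigned long target_before(const struct instance_model *in,
				   unsigned long cur_state)
{
	if (cur_state <= in->lower)
		return NO_TARGET;

	return clamp_ul(cur_state - 1, in->lower, in->upper);
}

/* Behavior after the patch: request the lower state right away. */
static unsigned long target_after(const struct instance_model *in,
				  unsigned long cur_state)
{
	if (cur_state <= in->lower)
		return NO_TARGET;

	return in->lower;
}

/*
 * Count how many governor updates it takes until no target is requested
 * any more, assuming the cooling device always reaches the requested state.
 */
static unsigned int updates_to_reach_lower(target_fn fn,
					   const struct instance_model *in,
					   unsigned long cur_state)
{
	unsigned int updates = 0;

	for (;;) {
		unsigned long target = fn(in, cur_state);

		if (target == NO_TARGET)
			break;

		cur_state = target;
		updates++;
	}

	return updates;
}
```

In this model, with lower = 0 and a cooling device sitting in state 5, the old logic needs five updates to get back to the lower state (each of which had to be driven by passive polling), whereas the new logic needs one.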