Re: [PATCH] platform/x86: asus-wmi: Re-enable custom fan curves after setting throttle_thermal_policy

From: Hans de Goede
Date: Mon Jan 22 2024 - 05:54:14 EST


Hi Luke,

On 1/16/24 20:43, Luke Jones wrote:
>
>
> On Tue, Jan 16 2024 at 11:25:41 +01:00:00, Hans de Goede <hdegoede@xxxxxxxxxx> wrote:
>> Hi Luke,
>>
>> On 1/15/24 21:25, Luke Jones wrote:
>>>
>>>
>>>  On Mon, Jan 15 2024 at 13:38:16 +01:00:00, Hans de Goede <hdegoede@xxxxxxxxxx> wrote:
>>>>  Hi,
>>>>
>>>>  On 1/15/24 13:22, Andrei Sabalenka wrote:
>>>>>   When changing throttle_thermal_policy, all the custom fan curves are getting disabled. This patch re-enables all the custom fan curves that were enabled before changing throttle_thermal_policy.
>>>>>
>>>>>   I believe it makes asus-wmi sysfs interface more convenient, as it allows userspace to manage fan curves independently from platform_profile and throttle_thermal_policy. At the kernel level, custom fan curves should not be tied to "power profiles" scheme in any way, as it gives the user less freedom of controlling them.
>>>>
>>>>  Setting a high performance power-profile typically also involves ramping up
>>>>  the fans harder. So I don't think this patch is a good idea.
>>>>
>>>>  If you really want this behavior then you can always re-enable the custom
>>>>  curve after changing the profile.
>>>>
>>>>  Luke, do you have any opinion on this?
>>>
>>>  I see some misconceptions that should be addressed:
>>>  1. ASUS themselves set separate fan curves per "platform profile", both standard and custom
>>>  2. fan curves are not tied to platform profiles, they are tied to the throttle_thermal_policy, and this is actually done in the acpi - so the code here is a mirror of that
>>>  3. platform-profiles are tied to throttle_thermal_policy
>>>
>>>  There is no lack of user control at all, a decent tool (like asusctl) can set fan curves without issues but it's perhaps not convenient for manually setting via a script etc.
>>>
>>>  The main reason that a curve is disabled for the policy being switched to is for safety. It was a paranoid choice I made at the time. The kernel (and acpi) can't guarantee that a user set a reasonable default for that policy so the safest thing is to force an explicit re-enable of it.
>>>
>>>  Having said that: I know that the curve was previously set for that profile/policy and in theory should be fine plus it is already used by the user, it is also not possible to set a curve for a different profile to the one a user is currently in -  this is forced in ACPI as you can set only the curve for the profile you are in (the kernel code also mirrors this).
>>>
>>>  So this patch should be fine.
>>>
>>>  Signed-off-by: Luke D. Jones <luke@xxxxxxxxxx>
>>
>> So I just checked asus-wmi.c again and there seems to be only 1 custom
>> curve per fan, one curve for CPU one for GPU and one for MID.
>
> I misread sorry. Yes this is correct. The ACPI only allows fetching the defaults for the currently loaded profile so this was a result of that.
>
>> And while the custom curve may be fine for e.g. low-power mode,
>> that same custom curve may lead to overheating/throttling with
>> performance mode since performance mode typically requires
>> higher fan speeds.
>>
>> As you write yourself: 'ASUS themselves set separate fan curves per
>> "platform profile", both standard and custom', but there is only 1
>> custom/user curve (in the kernel), not 1 per platform-profile.
>>
>> So IMHO disabling the custom curve on profile switching is
>> the correct thing to do. Then userspace can do something like:
>>
>
> Yes agreed. And that is indeed why I set them to off originally when changing profile.
>
>> 1. Have per platform-profile custom curves in some tool
>> 2. Have that tool change (or monitor) platform-profile
>> 3. Load new custom profile based on the new platform-profile
>> 4. Enable the new (fitting to the new platform-profile)
>>    custom fan curve.
>>
>> I also see that fan_curve_get_factory_default() retrieves the
>> defaults for a *specific* thermal-policy / platform-profile
>>
>> So if a user somehow just enables custom-fancurves without
>> actually changing the curve then this patch would lead
>> to the following scenario:
>>
>> 1. Driver loads, lets assume the system boots in balanced
>> mode, balanced factory-default fan-curve is now loaded into
>> the custom fan-curve by fan_curve_check_present()
>>
>> 2. User calls fan_curve_enable_store() writing "1", because
>> reasons.
>>
>> 3. User changes platform-profile to performance,
>> throttle_thermal_policy_write() calls asus_wmi_set_devstate(
>> ASUS_WMI_DEVID_THROTTLE_THERMAL_POLICY) and the EC
>> sets fan curve to performance factory-default fan-curve.
>>
>> 4. Next throttle_thermal_policy_write() will now call
>> fan_curve_write() restoring the balanced factory-default
>> fan-curve even though we are in performance mode now.
>>
>> This seems undesirable to me.
>>
>> Restoring custom fan-curves automatically on platform-profile
>> change IMHO requires also storing a separate custom curve
>> per profile inside the kernel and populating all custom
>> curves with the factory defaults at boot. If I read what
>> you have written above this would also actually match
>> what you wrote above about ASUS using separate custom curves
>> per profile. If ASUS uses separate custom curves per profile
>> then IMHO so should Linux.
>
> This is correct yes.
>
>>
>> Note custom fan-curves per profile still means that the custom
>> curve will be overwritten when changing profiles, some new sysfs
>> interface would be necessary to write the non-active custom
>> curves so that the restored curve on profile switch can be
>> custom too on the first switch.
>>
>> (rather then having to switch to be able to write the custom
>> curve for a profile other then the currently active profile).
>>
>> Note this is not a 100% hard nack for this patch, but atm
>> I'm leaning towards a nack.
>
> I revert my signed-off. This is a nack. Everything a user may want can be done in userspace.

Ok, I'm dropping this patch from the platfrom-driver-x86 patch-queue then.

Regards,

Hans




>>>>>   Signed-off-by: Andrei Sabalenka <mechakotik@xxxxxxxxx>
>>>>>   ---
>>>>>    drivers/platform/x86/asus-wmi.c | 29 ++++++++++++++++++++++-------
>>>>>    1 file changed, 22 insertions(+), 7 deletions(-)
>>>>>
>>>>>   diff --git a/drivers/platform/x86/asus-wmi.c b/drivers/platform/x86/asus-wmi.c
>>>>>   index 18be35fdb..c2e38f6d8 100644
>>>>>   --- a/drivers/platform/x86/asus-wmi.c
>>>>>   +++ b/drivers/platform/x86/asus-wmi.c
>>>>>   @@ -3441,13 +3441,28 @@ static int throttle_thermal_policy_write(struct asus_wmi *asus)
>>>>>            return -EIO;
>>>>>        }
>>>>>
>>>>>   -    /* Must set to disabled if mode is toggled */
>>>>>   -    if (asus->cpu_fan_curve_available)
>>>>>   -        asus->custom_fan_curves[FAN_CURVE_DEV_CPU].enabled = false;
>>>>>   -    if (asus->gpu_fan_curve_available)
>>>>>   -        asus->custom_fan_curves[FAN_CURVE_DEV_GPU].enabled = false;
>>>>>   -    if (asus->mid_fan_curve_available)
>>>>>   -        asus->custom_fan_curves[FAN_CURVE_DEV_MID].enabled = false;
>>>>>   +    /* Re-enable fan curves after profile change */
>>>>>   +    if (asus->cpu_fan_curve_available && asus->custom_fan_curves[FAN_CURVE_DEV_CPU].enabled) {
>>>>>   +        err = fan_curve_write(asus, &asus->custom_fan_curves[FAN_CURVE_DEV_CPU]);
>>>>>   +        if (err) {
>>>>>   +            pr_warn("Failed to re-enable CPU fan curve: %d\n", err);
>>>>>   +            return err;
>>>>>   +        }
>>>>>   +    }
>>>>>   +    if (asus->gpu_fan_curve_available && asus->custom_fan_curves[FAN_CURVE_DEV_GPU].enabled) {
>>>>>   +        err = fan_curve_write(asus, &asus->custom_fan_curves[FAN_CURVE_DEV_GPU]);
>>>>>   +        if (err) {
>>>>>   +            pr_warn("Failed to re-enable GPU fan curve: %d\n", err);
>>>>>   +            return err;
>>>>>   +        }
>>>>>   +    }
>>>>>   +    if (asus->mid_fan_curve_available && asus->custom_fan_curves[FAN_CURVE_DEV_MID].enabled) {
>>>>>   +        err = fan_curve_write(asus, &asus->custom_fan_curves[FAN_CURVE_DEV_MID]);
>>>>>   +        if (err) {
>>>>>   +            pr_warn("Failed to re-enable MID fan curve: %d\n", err);
>>>>>   +            return err;
>>>>>   +        }
>>>>>   +    }
>>>>>
>>>>>        return 0;
>>>>>    }
>>>>
>>>
>>>
>>
>
>