Re: [Patch v4 5/6] thermal/cpu-cooling: Update thermal pressure in case of a maximum frequency capping

From: Thara Gopinath
Date: Fri Nov 01 2019 - 17:04:52 EST


On 11/01/2019 11:47 AM, Ionela Voinescu wrote:
> Hi guys,
>
> On Thursday 31 Oct 2019 at 17:38:25 (+0100), Vincent Guittot wrote:
>> On Thu, 31 Oct 2019 at 17:29, Dietmar Eggemann <dietmar.eggemann@xxxxxxx> wrote:
>>>
>>> On 22.10.19 22:34, Thara Gopinath wrote:
>>>> Thermal governors can request that a cpu's maximum supported frequency
>>>> be capped in case of an overheat event. This in turn means that the
>>>> maximum capacity available for tasks to run on that cpu is
>>>> reduced. The delta between the original maximum capacity and the capped
>>>> maximum capacity is known as thermal pressure. Enable the cpufreq cooling
>>>> device to update the thermal pressure when the maximum frequency
>>>> is capped.
>>>>
>>>> Signed-off-by: Thara Gopinath <thara.gopinath@xxxxxxxxxx>
>>>> ---
>>>> drivers/thermal/cpu_cooling.c | 31 +++++++++++++++++++++++++++++--
>>>> 1 file changed, 29 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
>>>> index 391f397..2e6a979 100644
>>>> --- a/drivers/thermal/cpu_cooling.c
>>>> +++ b/drivers/thermal/cpu_cooling.c
>>>> @@ -218,6 +218,23 @@ static u32 cpu_power_to_freq(struct cpufreq_cooling_device *cpufreq_cdev,
>>>> }
>>>>
>>>> /**
>>>> + * update_sched_max_capacity - update scheduler about change in cpu
>>>> + * max frequency.
>>>> + * @policy - cpufreq policy whose max frequency is capped.
>>>> + */
>>>> +static void update_sched_max_capacity(struct cpumask *cpus,
>>>> + unsigned int cur_max_freq,
>>>> + unsigned int max_freq)
>>>> +{
>>>> + int cpu;
>>>> + unsigned long capacity = (cur_max_freq << SCHED_CAPACITY_SHIFT) /
>>>> + max_freq;
>>>> +
>>>> + for_each_cpu(cpu, cpus)
>>>> + update_thermal_pressure(cpu, capacity);
>>>> +}
>>>> +
>>>> +/**
>>>> * get_load() - get load for a cpu since last updated
>>>> * @cpufreq_cdev: &struct cpufreq_cooling_device for this cpu
>>>> * @cpu: cpu number
>>>> @@ -320,6 +337,7 @@ static int cpufreq_set_cur_state(struct thermal_cooling_device *cdev,
>>>> unsigned long state)
>>>> {
>>>> struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
>>>> + int ret;
>>>>
>>>> /* Request state should be less than max_level */
>>>> if (WARN_ON(state > cpufreq_cdev->max_level))
>>>> @@ -331,8 +349,17 @@ static int cpufreq_set_cur_state(struct thermal_cooling_device *cdev,
>>>>
>>>> cpufreq_cdev->cpufreq_state = state;
>>>>
>>>> - return dev_pm_qos_update_request(&cpufreq_cdev->qos_req,
>>>> - cpufreq_cdev->freq_table[state].frequency);
>>>> + ret = dev_pm_qos_update_request
>>>> + (&cpufreq_cdev->qos_req,
>>>> + cpufreq_cdev->freq_table[state].frequency);
>>>> +
>>>> + if (ret > 0)
>>>> + update_sched_max_capacity
>>>> + (cpufreq_cdev->policy->cpus,
>>>> + cpufreq_cdev->freq_table[state].frequency,
>>>> + cpufreq_cdev->policy->cpuinfo.max_freq);
>>>> +
>>>> + return ret;
>>>> }
>>>>
>>>> /**
>>>>
>>>
>>> Why not get rid of update_sched_max_capacity() entirely and call
>>> update_thermal_pressure() in cpu_cooling.c directly? That saves one level in
>>> the call chain and would mean less code for this feature.
>>
>> But then you add complexity to update_thermal_pressure, which now has to
>> deal with a cpumask and compute a frequency ratio.
>> IMHO, it's cleaner to keep update_thermal_pressure as simple as it is now.
>>
>
> How about removing update_thermal_pressure altogether and doing all the
> work in update_sched_max_capacity? That is, have
> update_sched_max_capacity compute the capped_freq_ratio, do the
> normalization, and set it per_cpu for all CPUs in the frequency domain.
> You'll save some calculations that you're now doing in
> update_thermal_pressure for each cpu and you avoid shifting back and
> forth.

Yes. I can pass the delta to update_thermal_pressure. I still want
to keep update_thermal_pressure and a per-cpu variable in fair.c to
store it.
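
A minimal sketch of that split, written as a userspace mock and not as the
actual next revision of the patch. The cooling-device side computes the delta
once per frequency domain, and update_thermal_pressure only stores it per cpu.
Assumptions made for illustration: the cpumask is modelled as a plain bitmask,
NR_CPUS is fixed at 8, the per-cpu variable is a static array, and the delta
is expressed on the 0..SCHED_CAPACITY_SCALE range without scaling by each
cpu's original capacity. The name update_sched_thermal_pressure follows the
rename suggested in this thread.

```c
#include <stdint.h>

#define SCHED_CAPACITY_SHIFT	10
#define SCHED_CAPACITY_SCALE	(1UL << SCHED_CAPACITY_SHIFT)
#define NR_CPUS			8	/* assumption: small fixed cpu count for the mock */

/* Userspace stand-in for the per-cpu variable that would live in fair.c. */
static unsigned long thermal_pressure[NR_CPUS];

/* Scheduler side: only stores the delta it is handed. */
static void update_thermal_pressure(int cpu, unsigned long delta)
{
	thermal_pressure[cpu] = delta;
}

/*
 * Cooling-device side: compute the delta once for the whole frequency
 * domain, then hand it to every cpu in the (bitmask) cpumask.
 * Assumes 0 < capped_freq <= max_freq, both in kHz.
 */
static void update_sched_thermal_pressure(unsigned long cpus,
					  unsigned int capped_freq,
					  unsigned int max_freq)
{
	uint64_t capacity;
	unsigned long delta;
	int cpu;

	if (!max_freq || capped_freq > max_freq)
		return;

	/*
	 * Widen before shifting: a 32-bit kHz value above 4194303
	 * would overflow when shifted left by SCHED_CAPACITY_SHIFT.
	 */
	capacity = ((uint64_t)capped_freq << SCHED_CAPACITY_SHIFT) / max_freq;
	delta = SCHED_CAPACITY_SCALE - (unsigned long)capacity;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (cpus & (1UL << cpu))
			update_thermal_pressure(cpu, delta);
}
```

With cpus 0-1 capped from 2000000 kHz to 1000000 kHz, both would end up with
a stored delta of 512, and uncapped cpus keep whatever value they had.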
>
> If you're doing so it would be worth renaming update_sched_max_capacity
> to something like update_sched_thermal_pressure.
Will do.


--
Warm Regards
Thara