Re: [Patch v4 5/6] thermal/cpu-cooling: Update thermal pressure in case of a maximum frequency capping

From: Ionela Voinescu
Date: Mon Nov 04 2019 - 09:41:53 EST


Hi Thara,

On Friday 01 Nov 2019 at 17:04:47 (-0400), Thara Gopinath wrote:
> On 11/01/2019 11:47 AM, Ionela Voinescu wrote:
> > Hi guys,
> >
> > On Thursday 31 Oct 2019 at 17:38:25 (+0100), Vincent Guittot wrote:
> >> On Thu, 31 Oct 2019 at 17:29, Dietmar Eggemann <dietmar.eggemann@xxxxxxx> wrote:
> >>>
> >>> On 22.10.19 22:34, Thara Gopinath wrote:
> >>>> Thermal governors can request for a cpu's maximum supported frequency
> >>>> to be capped in case of an overheat event. This in turn means that the
> >>>> maximum capacity available for tasks to run on the particular cpu is
> >>>> reduced. Delta between the original maximum capacity and capped
> >>>> maximum capacity is known as thermal pressure. Enable cpufreq cooling
> >>>> device to update the thermal pressure in event of a capped
> >>>> maximum frequency.
> >>>>
> >>>> Signed-off-by: Thara Gopinath <thara.gopinath@xxxxxxxxxx>
> >>>> ---
> >>>> drivers/thermal/cpu_cooling.c | 31 +++++++++++++++++++++++++++++--
> >>>> 1 file changed, 29 insertions(+), 2 deletions(-)
> >>>>
> >>>> diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
> >>>> index 391f397..2e6a979 100644
> >>>> --- a/drivers/thermal/cpu_cooling.c
> >>>> +++ b/drivers/thermal/cpu_cooling.c
> >>>> @@ -218,6 +218,23 @@ static u32 cpu_power_to_freq(struct cpufreq_cooling_device *cpufreq_cdev,
> >>>> }
> >>>>
> >>>> /**
> >>>> + * update_sched_max_capacity - update scheduler about change in cpu
> >>>> + * max frequency.
> >>>> + * @policy - cpufreq policy whose max frequency is capped.
> >>>> + */
> >>>> +static void update_sched_max_capacity(struct cpumask *cpus,
> >>>> + unsigned int cur_max_freq,
> >>>> + unsigned int max_freq)
> >>>> +{
> >>>> + int cpu;
> >>>> + unsigned long capacity = (cur_max_freq << SCHED_CAPACITY_SHIFT) /
> >>>> + max_freq;
> >>>> +
> >>>> + for_each_cpu(cpu, cpus)
> >>>> + update_thermal_pressure(cpu, capacity);
> >>>> +}
> >>>> +
> >>>> +/**
> >>>> * get_load() - get load for a cpu since last updated
> >>>> * @cpufreq_cdev: &struct cpufreq_cooling_device for this cpu
> >>>> * @cpu: cpu number
> >>>> @@ -320,6 +337,7 @@ static int cpufreq_set_cur_state(struct thermal_cooling_device *cdev,
> >>>> unsigned long state)
> >>>> {
> >>>> struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
> >>>> + int ret;
> >>>>
> >>>> /* Request state should be less than max_level */
> >>>> if (WARN_ON(state > cpufreq_cdev->max_level))
> >>>> @@ -331,8 +349,17 @@ static int cpufreq_set_cur_state(struct thermal_cooling_device *cdev,
> >>>>
> >>>> cpufreq_cdev->cpufreq_state = state;
> >>>>
> >>>> - return dev_pm_qos_update_request(&cpufreq_cdev->qos_req,
> >>>> - cpufreq_cdev->freq_table[state].frequency);
> >>>> + ret = dev_pm_qos_update_request
> >>>> + (&cpufreq_cdev->qos_req,
> >>>> + cpufreq_cdev->freq_table[state].frequency);
> >>>> +
> >>>> + if (ret > 0)
> >>>> + update_sched_max_capacity
> >>>> + (cpufreq_cdev->policy->cpus,
> >>>> + cpufreq_cdev->freq_table[state].frequency,
> >>>> + cpufreq_cdev->policy->cpuinfo.max_freq);
> >>>> +
> >>>> + return ret;
> >>>> }
> >>>>
> >>>> /**
> >>>>
> >>>
> >>> Why not getting rid of update_sched_max_capacity() entirely and call
> >>> update_thermal_pressure() in cpu_cooling.c directly? Saves one level in
> >>> the call chain and would mean less code for this feature.
> >>
> >> But you add complexity in update_thermal_pressure which now has to
> >> deal with a cpumask and to compute some frequency ratio
> >> IMHO, it's cleaner to keep update_thermal_pressure simple as it is now
> >>
> >
> > How about removing update_thermal_pressure altogether and doing all the
> > work in update_sched_max_capacity? That is, have
> > update_sched_max_capacity compute the capped_freq_ratio, do the
> > normalization, and set it per_cpu for all CPUs in the frequency domain.
> > You'll save some calculations that you're now doing in
> > update_thermal_pressure for each cpu and you avoid shifting back and
> > forth.
>
> Yes. I can pass the delta to update_thermal_pressure. I will still want
> to keep update_thermal_pressure and a per cpu variable in fair.c to
> store this.

Why do you want to keep the variable in fair.c? To me this thermal
pressure delta seems more of a CPU thermal characteristic than a
CFS one, so I would be tempted to define it and set it in
cpu_cooling.c and let fair.c/pelt.c be just the consumers of the
thermal pressure delta, either directly or through some interface.
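
To make the split concrete, here is a rough userspace sketch of the
ownership I have in mind. It is not kernel code: the plain array stands
in for a per-cpu variable, and thermal_pressure_delta,
cpu_thermal_pressure() and the max_capacity parameter are made-up names
for illustration. It also assumes that all CPUs in a frequency domain
share the same original capacity, so the delta is computed once per
domain rather than once per CPU.

```c
#include <assert.h>

#define NR_CPUS 8
#define SCHED_CAPACITY_SHIFT 10
#define SCHED_CAPACITY_SCALE (1UL << SCHED_CAPACITY_SHIFT)

/*
 * Stand-in for a per-cpu variable that would live in cpu_cooling.c
 * (hypothetical name; the kernel would use a real per-cpu definition).
 */
static unsigned long thermal_pressure_delta[NR_CPUS];

/* Hypothetical read-side interface: the only thing fair.c/pelt.c see. */
static unsigned long cpu_thermal_pressure(int cpu)
{
	return thermal_pressure_delta[cpu];
}

/*
 * Writer side, owned by the cooling device. The delta is computed once
 * for the whole frequency domain and then stored for each CPU in it.
 * max_capacity is assumed to be identical for every CPU in the domain.
 */
static void update_sched_thermal_pressure(const int *cpus, int nr_cpus,
					  unsigned int capped_freq,
					  unsigned int max_freq,
					  unsigned long max_capacity)
{
	unsigned long capped_capacity;
	unsigned long delta;
	int i;

	assert(max_freq != 0 && capped_freq <= max_freq);

	/* Widen before multiplying so large kHz values cannot overflow. */
	capped_capacity = (unsigned long)
		(((unsigned long long)max_capacity * capped_freq) / max_freq);
	delta = max_capacity - capped_capacity;

	for (i = 0; i < nr_cpus; i++)
		thermal_pressure_delta[cpus[i]] = delta;
}
```

With a domain of CPUs 0-3, a max_capacity of SCHED_CAPACITY_SCALE and
the frequency capped to half of max_freq, each of those CPUs ends up
with a delta of 512, while CPUs outside the domain are left untouched.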

What do you think?

Thanks,
Ionela.

> > If you're doing so it would be worth renaming update_sched_max_capacity
> > to something like update_sched_thermal_pressure.
> Will do.
>
>
> --
> Warm Regards
> Thara