Re: [PATCH] sched: Take thermal pressure into account when determine rt fits capacity

From: Xuewen Yan
Date: Mon Apr 11 2022 - 04:52:40 EST


Hi Dietmar,

On Mon, Apr 11, 2022 at 4:21 PM Dietmar Eggemann
<dietmar.eggemann@xxxxxxx> wrote:
>
> On 07/04/2022 07:19, Xuewen Yan wrote:
> > There are cases when the cpu max capacity might be reduced due to thermal.
> > Take the thermal pressure into account when judging whether the rt task
> > fits the cpu. And when the schedutil governor gets the cpu util, the thermal
> > pressure should also be considered.
> >
> > Signed-off-by: Xuewen Yan <xuewen.yan@xxxxxxxxxx>
> > ---
> > kernel/sched/cpufreq_schedutil.c | 1 +
> > kernel/sched/rt.c | 1 +
> > 2 files changed, 2 insertions(+)
> >
> > diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> > index 3dbf351d12d5..285ad51caf0f 100644
> > --- a/kernel/sched/cpufreq_schedutil.c
> > +++ b/kernel/sched/cpufreq_schedutil.c
> > @@ -159,6 +159,7 @@ static void sugov_get_util(struct sugov_cpu *sg_cpu)
> > struct rq *rq = cpu_rq(sg_cpu->cpu);
> > unsigned long max = arch_scale_cpu_capacity(sg_cpu->cpu);
> >
> > + max -= arch_scale_thermal_pressure(sg_cpu->cpu);
>
> max' = arch_scale_cpu_capacity() - arch_scale_thermal_pressure()
>
> For the energy part (A) we use max' in compute_energy() to cap sum_util
> and max_util at max' and to call em_cpu_energy(..., max_util, sum_util,
> max'). This was done to match (B)'s `policy->max` capping.
>
> For the frequency part (B) we have freq_qos_update_request() in:
>
>   power_actor_set_power()
>     ...
>     cdev->ops->set_cur_state()
>
>       cpufreq_set_cur_state()
>         freq_qos_update_request()      <-- !
>         arch_update_thermal_pressure()
>
> restricting `policy->max` which then clamps `target_freq` in:
>
>   cpufreq_update_util()
>     ...
>     get_next_freq()
>       cpufreq_driver_resolve_freq()
>         __resolve_freq()
>

Do you mean that the "max" here does not affect the final frequency
selection, because `policy->max` already clamps `target_freq`, so there is
no need to change it?
Even so, would it not be better to reflect the thermal pressure here as well?
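
To make the question concrete, here is a rough user-space model of the two
steps being discussed: the util-to-frequency mapping, followed by the clamp to
the policy limits. It is only a sketch, not kernel code. The model_* names and
all numbers are made up for illustration, and the mapping is simplified to
roughly 1.25 * max_freq * util / max.

```c
/*
 * User-space model only. Names prefixed model_ are hypothetical and are
 * not kernel symbols. Frequencies are in kHz, capacities use the usual
 * 0..1024 scale.
 */

/* Raw request: roughly 1.25 * max_freq * util / max. */
static unsigned long long model_raw_freq(unsigned long long util,
					 unsigned long long max_freq,
					 unsigned long long max)
{
	return (max_freq + (max_freq >> 2)) * util / max;
}

/* Final request after clamping the raw value to the policy limits. */
static unsigned long long model_next_freq(unsigned long long util,
					  unsigned long long max_freq,
					  unsigned long long max,
					  unsigned long long policy_min,
					  unsigned long long policy_max)
{
	unsigned long long freq = model_raw_freq(util, max_freq, max);

	if (freq < policy_min)
		return policy_min;
	if (freq > policy_max)
		return policy_max;
	return freq;
}
```

With an assumed max_freq of 2000000 kHz, a thermally capped policy_max of
1500000 kHz and an assumed thermal pressure of 256, a util of 700 ends up at
policy_max whether max is 1024 or 768. A util of 400 gives different requests
for the two values of max, both below policy_max. So in this simplified model
the change to "max" is hidden by the clamp only when the raw request already
exceeds policy_max.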

> [...]
>
> > diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> > index a32c46889af8..d9982ebd4821 100644
> > --- a/kernel/sched/rt.c
> > +++ b/kernel/sched/rt.c
> > @@ -466,6 +466,7 @@ static inline bool rt_task_fits_capacity(struct task_struct *p, int cpu)
> > max_cap = uclamp_eff_value(p, UCLAMP_MAX);
> >
> > cpu_cap = capacity_orig_of(cpu);
> > + cpu_cap -= arch_scale_thermal_pressure(cpu);
> >
> > return cpu_cap >= min(min_cap, max_cap);
> > }
>
> IMHO, this should follow what we do with rq->cpu_capacity
> (capacity_of(), the remaining capacity for CFS). E.g. we use
> capacity_of() in find_energy_efficient_cpu() and select_idle_capacity()
> to compare capacities. So we would need a function like
> scale_rt_capacity() for RT (minus the rq->avg_rt.util_avg) but then also
> one for DL (minus rq->avg_dl.util_avg and rq->avg_rt.util_avg).

That is a really good idea. Do you already have a corresponding patch?
If so, could you send me the link?
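
In case it helps the discussion, this is how I read the suggestion, written as
a small user-space model rather than a kernel patch. I am assuming that
"minus" means "without subtracting", so the capacity left for RT ignores RT's
own utilization, and the capacity left for DL ignores both RT and DL. The
struct, its fields and the model_* helpers are hypothetical and only loosely
follow the shape of scale_rt_capacity().

```c
/* Hypothetical per-CPU snapshot; not the kernel's struct rq. */
struct model_rq {
	unsigned long cap_orig;	/* original CPU capacity, 0..1024 */
	unsigned long irq_util;	/* utilization consumed by IRQs */
	unsigned long rt_util;	/* stands in for rq->avg_rt.util_avg */
	unsigned long dl_util;	/* stands in for rq->avg_dl.util_avg */
	unsigned long thermal;	/* capacity lost to thermal pressure */
};

/* Remove "used" from the capacity, then scale by the non-IRQ share. */
static unsigned long model_scale(const struct model_rq *rq, unsigned long used)
{
	unsigned long max = rq->cap_orig;

	if (rq->irq_util >= max || used >= max)
		return 1;

	return (max - used) * (max - rq->irq_util) / max;
}

/* Capacity left for CFS: RT, DL and thermal are all subtracted. */
static unsigned long model_capacity_for_cfs(const struct model_rq *rq)
{
	return model_scale(rq, rq->rt_util + rq->dl_util + rq->thermal);
}

/* Capacity left for RT: RT's own utilization is not subtracted. */
static unsigned long model_capacity_for_rt(const struct model_rq *rq)
{
	return model_scale(rq, rq->dl_util + rq->thermal);
}

/* Capacity left for DL: neither RT nor DL utilization is subtracted. */
static unsigned long model_capacity_for_dl(const struct model_rq *rq)
{
	return model_scale(rq, rq->thermal);
}
```

rt_task_fits_capacity() could then compare against something like
model_capacity_for_rt() instead of capacity_orig_of(cpu), if I understand the
idea correctly.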

Thanks a lot!

BR
xuewen