Re: [Patch v8 4/7] sched/fair: Enable periodic update of average thermal pressure

From: Vincent Guittot
Date: Mon Jan 27 2020 - 10:15:20 EST


On Mon, 27 Jan 2020 at 13:09, Dietmar Eggemann <dietmar.eggemann@xxxxxxx> wrote:
>
> On 24/01/2020 16:45, Vincent Guittot wrote:
> > On Fri, 24 Jan 2020 at 16:37, Dietmar Eggemann <dietmar.eggemann@xxxxxxx> wrote:
> >>
> >> On 17/01/2020 16:39, Vincent Guittot wrote:
> >>> On Fri, 17 Jan 2020 at 15:55, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> >>>>
> >>>> On Fri, Jan 17, 2020 at 02:22:51PM +0100, Vincent Guittot wrote:
> >>>>> On Thu, 16 Jan 2020 at 16:15, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> [...]
>
> >> The 'now' argument is one thing but why not:
> >>
> >> -int update_thermal_load_avg(u64 now, struct rq *rq, u64 capacity)
> >> +int update_thermal_load_avg(u64 now, struct rq *rq)
> >> {
> >> + u64 capacity = arch_cpu_thermal_pressure(cpu_of(rq));
> >> +
> >> if (___update_load_sum(now, &rq->avg_thermal,
> >>
> >> This would make the call-sites __update_blocked_others() and
> >> task_tick(_fair)() cleaner.
> >
> > I prefer to keep the capacity as argument. This is more aligned with
> > others that provides the value of the signal to apply
> >
> >>
> >> I guess the argument is not to pollute pelt.c. But it already contains
> >
> > you've got it. I don't want to pollute the pelt.c file with things not
> > related to pelt but thermal as an example.
> >
> >> arch_scale_[freq|cpu]_capacity() for irq.
>
> But isn't arch_cpu_thermal_pressure() not exactly the same as
> arch_scale_cpu_capacity() and arch_scale_freq_capacity()?
>
> All of them are defined by default within the scheduler code
> [include/linux/sched/topology.h or kernel/sched/sched.h] and can be
> overwritten by arch code with a fast implementation (e.g. returning a
> per-cpu variable).
>
> So why is using arch_scale_freq_capacity() and arch_scale_cpu_capacity()
> in update_irq_load_avg [kernel/sched/pelt.c] and update_rq_clock_pelt()

As explained previously, update_irq_load_avg is an exception and not
the example to follow. update_rt/dl_rq_load_avg are the example to
follow and fixing update_irq_load_avg exception is on my todo list

> [kernel/sched/pelt.h] OK but arch_cpu_thermal_pressure() in
> update_thermal_load_avg() [kernel/sched/pelt.c] not?
>
> Shouldn't arch_cpu_thermal_pressure() not be called
> arch_scale_thermal_capacity() to highlight the fact that those three

Quoted from cover letter https://lkml.org/lkml/2020/1/14/1164:
"
v6->v7:
- ...
- Renamed arch_scale_thermal_capacity to arch_cpu_thermal_pressure
as per review comments from Peter, Dietmar and Ionela.
-...

"

> functions are doing the same thing, scaling capacity by something (cpu,
> frequency or thermal)?