Re: [PATCH v2 1/3] sched/uclamp: Set max_spare_cap_cpu even if max_spare_cap is 0
From: Vincent Guittot
Date: Wed Feb 22 2023 - 06:00:12 EST
On Tue, 21 Feb 2023 at 13:05, Qais Yousef <qyousef@xxxxxxxxxxx> wrote:
>
> On 02/20/23 18:02, Vincent Guittot wrote:
> > On Sat, 11 Feb 2023 at 18:28, Qais Yousef <qyousef@xxxxxxxxxxx> wrote:
> > >
> > > On 02/07/23 10:45, Vincent Guittot wrote:
> > > > On Sun, 5 Feb 2023 at 23:43, Qais Yousef <qyousef@xxxxxxxxxxx> wrote:
> > > > >
> > > > > When uclamp_max is being used, the util of the task could be higher than
> > > > > the spare capacity of the CPU, but due to uclamp_max value we force fit
> > > > > it there.
> > > > >
> > > > > The way the condition for checking for max_spare_cap in
> > > > > find_energy_efficient_cpu() was constructed; it ignored any CPU that has
> > > > > its spare_cap less than or _equal_ to max_spare_cap. Since we initialize
> > > > > max_spare_cap to 0; this lead to never setting max_spare_cap_cpu and
> > > > > hence ending up never performing compute_energy() for this cluster and
> > > > > missing an opportunity for a better energy efficient placement to honour
> > > > > uclamp_max setting.
> > > > >
> > > > > max_spare_cap = 0;
> > > > > cpu_cap = capacity_of(cpu) - task_util(p); // 0 if task_util(p) is high
> > > > >
> > > > > ...
> > > > >
> > > > > util_fits_cpu(...); // will return true if uclamp_max forces it to fit
> > > > >
> > > > > ...
> > > > >
> > > > > // this logic will fail to update max_spare_cap_cpu if cpu_cap is 0
> > > > > if (cpu_cap > max_spare_cap) {
> > > > > max_spare_cap = cpu_cap;
> > > > > max_spare_cap_cpu = cpu;
> > > > > }
> > > > >
> > > > > prev_spare_cap suffers from a similar problem.
> > > > >
> > > > > Fix the logic by converting the variables into long and treating -1
> > > > > value as 'not populated' instead of 0 which is a viable and correct
> > > > > spare capacity value.
> > > > >
> > > > > Fixes: 1d42509e475c ("sched/fair: Make EAS wakeup placement consider uclamp restrictions")
> > > > > Signed-off-by: Qais Yousef (Google) <qyousef@xxxxxxxxxxx>
> > > > > ---
> > > > > kernel/sched/fair.c | 9 ++++-----
> > > > > 1 file changed, 4 insertions(+), 5 deletions(-)
> > > > >
> > > > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > > > > index c6c8e7f52935..7a21ee74139f 100644
> > > > > --- a/kernel/sched/fair.c
> > > > > +++ b/kernel/sched/fair.c
> > > > > @@ -7382,11 +7382,10 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
> > > > > for (; pd; pd = pd->next) {
> > > > > unsigned long util_min = p_util_min, util_max = p_util_max;
> > > > > unsigned long cpu_cap, cpu_thermal_cap, util;
> > > > > - unsigned long cur_delta, max_spare_cap = 0;
> > > > > + long prev_spare_cap = -1, max_spare_cap = -1;
> > > > > unsigned long rq_util_min, rq_util_max;
> > > > > - unsigned long prev_spare_cap = 0;
> > > > > + unsigned long cur_delta, base_energy;
> > > > > int max_spare_cap_cpu = -1;
> > > > > - unsigned long base_energy;
> > > > > int fits, max_fits = -1;
> > > > >
> > > > > cpumask_and(cpus, perf_domain_span(pd), cpu_online_mask);
> > > > > @@ -7461,7 +7460,7 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
> > > > > }
> > > > > }
> > > > >
> > > > > - if (max_spare_cap_cpu < 0 && prev_spare_cap == 0)
> > > > > + if (max_spare_cap_cpu < 0 && prev_spare_cap < 0)
> > > > > continue;
> > > > >
> > > > > eenv_pd_busy_time(&eenv, cpus, p);
> > > > > @@ -7469,7 +7468,7 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
> > > > > base_energy = compute_energy(&eenv, pd, cpus, p, -1);
> > > > >
> > > > > /* Evaluate the energy impact of using prev_cpu. */
> > > > > - if (prev_spare_cap > 0) {
> > > > > + if (prev_spare_cap > -1) {
> > > > > prev_delta = compute_energy(&eenv, pd, cpus, p,
> > > > > prev_cpu);
> > > > > /* CPU utilization has changed */
> > > >
> > > > I think that you also need the change below to make sure that the
> > > > signed comparison will be used. I have quickly checked the assembly
> > > > code for aarch64 and your patch keeps using unsigned comparison (b.ls)
> > > > ((fits == max_fits) && (cpu_cap > max_spare_cap))) {
> > > > ffff8000080e4c94: f94067e0 ldr x0, [sp, #200]
> > > > ffff8000080e4c98: eb00003f cmp x1, x0
> > > > ffff8000080e4c9c: 54ff98a9 b.ls ffff8000080e3fb0
> > > > <select_task_rq_fair+0x570> // b.plast
> > > >
> > > > Whereas the change below make it to use the signed version (b.le)
> > > > ((fits == max_fits) && ((long)cpu_cap > max_spare_cap))) {
> > > > ffff8000080e4c94: f94067e0 ldr x0, [sp, #200]
> > > > ffff8000080e4c98: eb00003f cmp x1, x0
> > > > ffff8000080e4c9c: 54ff98ad b.le ffff8000080e3fb0 <select_task_rq_fair+0x570>
> > > >
> > > > -- a/kernel/sched/fair.c
> > > > +++ b/kernel/sched/fair.c
> > > > @@ -7522,7 +7522,7 @@ static int find_energy_efficient_cpu(struct
> > > > task_struct *p, int prev_cpu)
> > > > prev_spare_cap = cpu_cap;
> > > > prev_fits = fits;
> > > > } else if ((fits > max_fits) ||
> > > > - ((fits == max_fits) && (cpu_cap >
> > > > max_spare_cap))) {
> > > > + ((fits == max_fits) &&
> > > > ((long)cpu_cap > max_spare_cap))) {
> > > > /*
> > > > * Find the CPU with the maximum spare capacity
> > > > * among the remaining CPUs in the performance
> > >
> > > Isn't it better to go back to v1 form then? The inconsistent type paired with
> > > the cast isn't getting too ugly for me :(
> >
> > the cast into a long of the cpu capacity in the condition was a good
> > way to fix this unsigned/signed comparison and make is consistent with
> > the use of -1 as default value IMO
> > ((long)cpu_cap > max_spare_cap)
>
> Fair enough. Our boolean brains differ :-) I'll add the cast.
>
> Do you see the energy calculation issue Dietmar was trying to highlight? As it
> stands I only have boolean tweaks to do for v3.
I haven't looked too much at uclamp_max impact in energy calculation.
Nevertheless, I wonder if one solution could be to not clamp the
utilization to cpu max capacity in this case. The fact that
utilization can go above cpu capacity when we clamp its frequency
reflect that it would need more compute capacity or it will run
longer. I will try to look more deeply in this use case
>
>
> Cheers
>
> --
> Qais Yousef
>
> > >
> > > I don't think we can convert cpu_cap to long without having to do more work as
> > > it is used with 'util'.
> > >
> > >
> > > Cheers
> > >
> > > --
> > > Qais Yousef