Re: [PATCH v2 2/2] sched/fair: Fix negative energy delta in find_energy_efficient_cpu()

From: Vincent Donnefort
Date: Tue May 04 2021 - 05:27:24 EST


On Thu, Apr 29, 2021 at 11:19:48AM +0100, Pierre.Gondois@xxxxxxx wrote:
> From: Pierre Gondois <Pierre.Gondois@xxxxxxx>
>
> find_energy_efficient_cpu() (feec()) searches the best energy CPU
> to place a task on. To do so, compute_energy() estimates the energy
> impact of placing the task on a CPU, based on CPU and task utilization
> signals.
>
> Utilization signals can be concurrently updated while evaluating a
> performance domain (pd). In some cases, this leads to having a
> 'negative delta', i.e. placing the task in the pd is seen as an
> energy gain. Thus, any further energy comparison is biased.
>
> In case of a 'negative delta', return prev_cpu since:
> 1. a 'negative delta' happens in less than 0.5% of feec() calls,
> on a Juno with 6 CPUs (4 little, 2 big)
> 2. it is unlikely to have two consecutive 'negative delta' for
> a task, so if the first call fails, feec() will correctly
> place the task in the next feec() call
> 3. EAS current behavior tends to select prev_cpu if the task
> doesn't raise the OPP of its current pd. prev_cpu is EAS's
> generic decision
> 4. prev_cpu should be preferred to returning an error code.
> In the latter case, select_idle_sibling() would do the placement,
> selecting a big (and not energy efficient) CPU. As 3., the task
> would potentially reside on the big CPU for a long time
>
> Reported-by: Xuewen Yan <xuewen.yan@xxxxxxxxxx>
> Suggested-by: Xuewen Yan <xuewen.yan@xxxxxxxxxx>
> Signed-off-by: Pierre Gondois <Pierre.Gondois@xxxxxxx>
> ---

I've been testing this patch on a Google Pixel 4, with a modified kernel that
we are using to evaluate mainline performance and energy consumption for
"real-life" mobile usage.

As always, I ran the Work2.0 workload from PCMark on Android. With that setup I
haven't observed any statistically significant change in performance or in CPU
idle residency. Nevertheless, this code protected against ~600 bad computations
(and by extension bad placements) during a single PCMark iteration. Looking at
the traces, this avoids spurious wake-ups that would otherwise happen on the
biggest CPUs of the system.
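
For readers following along, the guard being reviewed can be sketched in
userspace C roughly as below. This is only a simplified illustration, not the
patch itself: struct candidate, pick_cpu() and the two energy fields are
hypothetical names standing in for the pair of compute_energy() estimates
(without and with the task), and the real feec() has additional checks that
are left out here.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the two energy estimates made per candidate CPU.
 * In the kernel these come from compute_energy(); here they are plain inputs.
 */
struct candidate {
	int cpu;
	unsigned long base_energy;	/* pd energy estimated without the task */
	unsigned long energy_with;	/* pd energy estimated with the task on cpu */
};

/*
 * Pick the candidate with the smallest energy delta. If any estimate shows a
 * 'negative delta' (energy_with < base_energy), the utilization signals were
 * likely updated between the two estimates, so give up and return prev_cpu.
 */
int pick_cpu(const struct candidate *c, size_t n, int prev_cpu)
{
	unsigned long best_delta = ULONG_MAX;
	int best_cpu = prev_cpu;
	size_t i;

	assert(c != NULL || n == 0);

	for (i = 0; i < n; i++) {
		unsigned long delta;

		/* Negative delta: the comparison is biased, fall back. */
		if (c[i].energy_with < c[i].base_energy)
			return prev_cpu;

		/* Safe to subtract now, no unsigned underflow possible. */
		delta = c[i].energy_with - c[i].base_energy;
		if (delta < best_delta) {
			best_delta = delta;
			best_cpu = c[i].cpu;
		}
	}

	return best_cpu;
}
```

The check is done before the subtraction because the energies are unsigned: a
negative delta would otherwise wrap around to a huge value instead of being
detected.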

+ Reviewed-by: Vincent Donnefort <vincent.donnefort@xxxxxxx>