Re: [PATCH] sched/fair: Fix negative energy delta in find_energy_efficient_cpu()
From: Dietmar Eggemann
Date: Fri Apr 23 2021 - 11:48:11 EST
On 22/04/2021 11:44, Pierre Gondois wrote:
> Hi Dietmar,
> Thanks for the review,
>
> On 4/20/21 6:25 PM, Dietmar Eggemann wrote:
>
>> On 20/04/2021 14:56, Pierre.Gondois@xxxxxxx wrote:
>>> From: Pierre Gondois <Pierre.Gondois@xxxxxxx>
[...]
>> Did you run some tests to make sure you didn't regress on energy
>> consumption? You could run EAS' Energy tests w/ and w/o the patch
>> depicted in:
>>
>> https://lkml.kernel.org/r/20181203095628.11858-1-quentin.perret@xxxxxxx
>
>
> I executed the energy test you pointed at using LISA on a Juno-r2 (2xA57
> + 4xA53). The initial tests made by Quentin was on a Juno-r0 and a
> Hikey960.
>
> To recall the test:
> "10 iterations of between 10 and 50 periodic rt-app tasks (16ms period,
> 5% duty-cycle) for 30 seconds with energy measurement. Unit is Joules.
> The goal is to save energy, so lower is better."
> "Energy is measured with the onboard energy meter. Numbers include
> consumption of big and little CPUs."
>
> +----------+-----------------+-------------------------+
> | | Without patches | With patches |
> +----------+--------+--------+------------------+------+
> | Tasks nb | Mean | CI* | Mean | CI* |
> +----------+--------+--------+------------------+------+
> | 10 | 6.57 | 0.24 | 6.46 (-1.69%) | 0.16 |
> | 20 | 12.44 | 0.21 | 12.40 (-0.33%) | 0.11 |
> | 30 | 19.10 | 0.78 | 18.93 (-0.89%) | 0.46 |
> | 40 | 27.27 | 0.53 | 27.49 (+0.81% | 0.17 |
> | 50 | 36.55 | 0.42 | 37.21 (+1.80%) | 0.81 |
> +----------+-----------------+-------------------------+
> CI: confidence interval
>
> For each line, the intervals of values w/ wo/ the patches are
> overlapping (consider Mean +/- CI). Thus, the energy results shouldn't
> have been impacted.
Put this into the patch header so people see some testing has been done.
[...]
>>> + if (compute_prev_delta) {
>>> + prev_delta = compute_energy(p, prev_cpu, pd);
>>> + /* Prevent negative deltas and select prev_cpu */
>> Not sure if this comment helps in understanding the code. We don't
>> comment that we return prev_cpu if !task_util_est(p) or we're not
>> entering the `Pick the best CPU ...` condition.
> I thought it was not obvious how (prev_delta < base_energy_pd) could
> happen. I'm ok to remove the comment, but maybe a sentence should be
> added in the function header or somewhere else.
Agreed. Remove the commend and add text in the patch header to
illustrate how you `fix negative energy delta ...`.
[...]
> Same comment: I'm ok to remove it, but we should explain what happens
> somewhere, maybe in the function header.
Same here.
[...]
>>> @@ -6674,25 +6688,20 @@ static int find_energy_efficient_cpu(struct
>>> task_struct *p, int prev_cpu)
>>> }
>>> }
>>> }
>>> -unlock:
>>> - rcu_read_unlock();
>> You don't close the RCU read-side critical section here anymore but
>> include the following if condition as well. Don't we always want to
>> close them as quick as possible? We could still return target (prev_cpu)
>> after the if condition below ...
> The computation should not take that long and this would make less code.
> Putting back the rcu_read_unlock() and returning faster instead of
> having a fall-through would also work for me.
I see but I would stay on the save side and keep the RCU read-side
critical section as short as possible.
>>> /*
>>> - * Pick the best CPU if prev_cpu cannot be used, or if it saves at
>>> - * least 6% of the energy used by prev_cpu.
>>> + * Pick the best CPU if:
>>> + * - prev_cpu cannot be used, or
>>> + * - it saves at least 6% of the energy used by prev_cpu
>>> */
>> Why changing the layout of this comment?
>
> I thought it was clearer to have bullet points. It can be reverted.
Please revert. Keep the changes as small as possible.
[...]