Re: [PATCH 1/3] sched/fair: Prepare variables for increased precision of EAS estimated energy

From: Lukasz Luba
Date: Wed Jul 07 2021 - 05:48:14 EST

On 7/7/21 10:37 AM, Vincent Guittot wrote:
On Wed, 7 Jul 2021 at 10:23, Lukasz Luba <lukasz.luba@xxxxxxx> wrote:

On 7/7/21 9:00 AM, Vincent Guittot wrote:
On Wed, 7 Jul 2021 at 09:49, Lukasz Luba <lukasz.luba@xxxxxxx> wrote:

On 7/7/21 8:07 AM, Vincent Guittot wrote:
On Fri, 25 Jun 2021 at 17:26, Lukasz Luba <lukasz.luba@xxxxxxx> wrote:

The Energy Aware Scheduler (EAS) tries to find the best CPU for a waking up
task. It probes many possibilities and compares the estimated energy values
for different scenarios. For calculating those energy values it relies on
Energy Model (EM) data and em_cpu_energy(). The precision which is used in
EM data is in milli-Watts (or abstract scale), which sometimes is not
sufficient. In some cases it might happen that two CPUs from different
Performance Domains (PDs) get the same calculated value for a given task
placement, but on a more precise scale they might differ. This rounding
error has to be addressed. This patch prepares EAS code for better
precision in the coming EM improvements.

Could you explain why 32bits results are not enough and you need to
move to 64bits ?

Right now the result is in the range [0..2^32[ mW. If you need more
precision and you want to return uW instead, you will have a result in
the range [0..4kW[, which still seems to be enough.

Currently the max value for 'power' in EM is limited to
EM_MAX_POWER, 0xffff (64k - 1). We allow registering power values
that big, ~64k mW (~64 Watts), for an OPP. Then based on 'power' we
pre-calculate the 'cost' fields:
cost[i] = power[i] * freq_max / freq[i]
So, for max freq the cost == power. Let's use that in the example.
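
A minimal userspace sketch of that pre-calculation, only to make the
arithmetic concrete. The struct and function names (ps_sketch, ps_cost) are
hypothetical and are not the kernel's EM definitions; the 64-bit
intermediate is an assumption of the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one performance state; not the kernel struct. */
struct ps_sketch {
	uint32_t freq_khz;	/* frequency of this OPP */
	uint32_t power_mw;	/* power in mW, at most 0xffff per the mail */
};

/* cost[i] = power[i] * freq_max / freq[i], computed with a 64-bit product. */
uint64_t ps_cost(const struct ps_sketch *ps, uint32_t freq_max_khz)
{
	assert(ps->freq_khz != 0);
	return (uint64_t)ps->power_mw * freq_max_khz / ps->freq_khz;
}
```

For the OPP at the max frequency the ratio freq_max / freq is 1, so
ps_cost() returns the power value unchanged, which is the case used in the
example that follows.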

Then em_cpu_energy() calculates as follows:
cost * sum_util / scale_cpu
We are interested in the first part - the value of the multiplication.

But all these are internal computations of the energy model. At the
end, the computed energy that is returned by compute_energy() and
em_cpu_energy(), fits in a long

Let's take a look at the existing *10000 precision for x CPUs:
cost * sum_util / scale_cpu =
(64k *10000) * (x * 800) / 1024
which is:
x * ~500mln

So to be close to overflowing u32, 'x' has to be > (?=) 8
(depends on sum_util).
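
The threshold can be checked with a small userspace sketch. It assumes the
numbers used in this mail (power 0xffff, the *10000 factor, scale_cpu 1024)
and the same utilization on every CPU; the SKETCH_* macros and function
names are made up for the illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_POWER	0xffffULL	/* EM_MAX_POWER value from the mail */
#define SKETCH_PRECISION	10000ULL	/* the *10000 factor from the mail */
#define SKETCH_SCALE_CPU	1024ULL

/* cost * sum_util / scale_cpu for nr_cpus CPUs, kept in 64 bits. */
uint64_t scaled_energy(unsigned int nr_cpus, uint64_t util_per_cpu)
{
	uint64_t cost = SKETCH_MAX_POWER * SKETCH_PRECISION;

	return cost * (nr_cpus * util_per_cpu) / SKETCH_SCALE_CPU;
}

/* Smallest CPU count whose result no longer fits in 32 bits. */
unsigned int min_cpus_exceeding_u32(uint64_t util_per_cpu)
{
	unsigned int x = 1;

	assert(util_per_cpu > 0);
	while (scaled_energy(x, util_per_cpu) <= UINT32_MAX)
		x++;
	return x;
}
```

With 800 util per CPU this gives 9 CPUs (8 CPUs still fit, at 4095937500),
and with fully utilized CPUs (1024) it gives 7, which is the dependence on
sum_util mentioned above.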

Sorry but I don't get your point.
This patch is about the return type of compute_energy() and
em_cpu_energy(). And even if we decide to return uW instead of mW,
there is still a lot of margin.

Just because you need u64 for computing an intermediate value doesn't
mean that you must return u64.

The example above shows the need for a u64 return value on platforms
which:
- are 32bit
- have e.g. 16 CPUs
- have big power values, e.g. ~64k mW
Then let's do the calc:
(64k * 10000) * (16 * 800) / 1024 = ~8000mln = ~8bln

The returned value after applying the whole patch set
won't fit in u32 for such a cluster.
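
A userspace sketch of what happens to the return value in that case. It
assumes the return type is 32 bits wide, as 'unsigned long' typically is on
a 32-bit platform; energy_u64() and energy_u32() are hypothetical helpers,
not the kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* 64-bit return: the whole value survives. */
uint64_t energy_u64(uint64_t cost, uint64_t sum_util, uint64_t scale_cpu)
{
	assert(scale_cpu != 0);
	return cost * sum_util / scale_cpu;
}

/* 32-bit return: the upper bits are dropped even if the math used u64. */
uint32_t energy_u32(uint64_t cost, uint64_t sum_util, uint64_t scale_cpu)
{
	return (uint32_t)energy_u64(cost, sum_util, scale_cpu);
}
```

For cost = 0xffff * 10000 and sum_util = 16 * 800, energy_u64() returns
8191875000 while energy_u32() returns 3896907704, i.e. the value modulo
2^32. A wider intermediate alone does not help if the returned type stays
at 32 bits.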

We might make the *assumption* that 32bit platforms will not
have a bigger number of CPUs in the cluster or won't report
big power values. But I didn't want to make such an assumption.