Re: [PATCH 1/3] sched/fair: Prepare variables for increased precision of EAS estimated energy

From: Lukasz Luba
Date: Wed Jul 07 2021 - 05:54:24 EST

Next message: Janis Schoetterl-Glausch: "Re: [PATCH] KVM: s390: Enable specification exception interpretation"
Previous message: Ding Hui: "Re: [PATCH v2] x86/mce: Fix endless loop when run task works after #MC"
In reply to: Dietmar Eggemann: "Re: [PATCH 1/3] sched/fair: Prepare variables for increased precision of EAS estimated energy"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 7/7/21 10:45 AM, Dietmar Eggemann wrote:

On 07/07/2021 10:23, Lukasz Luba wrote:

On 7/7/21 9:00 AM, Vincent Guittot wrote:
On Wed, 7 Jul 2021 at 09:49, Lukasz Luba <lukasz.luba@xxxxxxx> wrote:

On 7/7/21 8:07 AM, Vincent Guittot wrote:

On Fri, 25 Jun 2021 at 17:26, Lukasz Luba <lukasz.luba@xxxxxxx> wrote:

[...]

Could you explain why 32bits results are not enough and you need to
move to 64bits ?

Right now the result is in the range [0..2^32[ mW. If you need more
precision and you want to return uW instead, you will have a result in
the range [0..4kW[ which seems to be still enough
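
As a quick check of the ranges quoted above, here is a minimal userspace sketch (not kernel code; the helper name max_watts_u32() is made up for illustration) that computes how many whole watts a u32 can hold when the unit is mW or uW:

```c
#include <stdint.h>

/*
 * Hypothetical helper: largest power, in whole watts, that a u32 can
 * represent when one watt equals 'units_per_watt' units
 * (1000 for mW, 1000000 for uW).
 */
static uint64_t max_watts_u32(uint64_t units_per_watt)
{
	return (uint64_t)UINT32_MAX / units_per_watt;
}
```

With uW as the unit this gives 4294 W, i.e. the "4kW" upper bound mentioned above.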

Currently the value of 'power' in EM is capped at
EM_MAX_POWER 0xffff (64k - 1). We allow registering power values
that big, ~64k mW (~64 Watts), for an OPP. Then, based on 'power', we
pre-calculate the 'cost' fields:
cost[i] = power[i] * freq_max / freq[i]
So, for max freq the cost == power. Let's use that in the example.
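
The cost formula above can be sketched in plain C as follows. This is only an illustration of the arithmetic, not the EM framework code: em_cost_sketch() is a made-up name, and any frequencies passed to it are arbitrary example values.

```c
#include <stdint.h>

/* Limit quoted in this mail: 64k - 1, in mW. */
#define EM_MAX_POWER_SKETCH 0xffffULL

/*
 * cost[i] = power[i] * freq_max / freq[i]
 * Hypothetical helper; 64-bit types are used so the product cannot wrap.
 */
static uint64_t em_cost_sketch(uint64_t power, uint64_t freq_max, uint64_t freq)
{
	return power * freq_max / freq;
}
```

At freq == freq_max the helper returns 'power' unchanged, which is the case used in the example.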

Then em_cpu_energy() calculates as follows:
cost * sum_util / scale_cpu
We are interested in the first part - the value of multiplication.

But all these are internal computations of the energy model. At the
end, the computed energy that is returned by compute_energy() and
em_cpu_energy(), fits in a long

Let's take a look at existing *10000 precision for x CPUs:
cost * sum_util / scale_cpu =
(64k *10000) * (x * 800) / 1024
which is:
x * ~500mln

So to be close to overflowing u32 the 'x' has to be > (?=) 8
(depends on sum_util).

I assume the worst case is `x * 1024` (1024 being the max return value
of effective_cpu_util()), so x ~ 6.7.
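
The two estimates above (per-CPU utilization of 800 and of 1024) can be reproduced with a small userspace sketch. The function names are invented for illustration, and the sketch only looks at the final result of cost * sum_util / scale_cpu, evaluated in 64 bits:

```c
#include <stdint.h>

/* cost * sum_util / scale_cpu, with sum_util = nr_cpus * util_per_cpu. */
static uint64_t energy_estimate(uint64_t cost, uint64_t util_per_cpu,
				uint64_t nr_cpus, uint64_t scale_cpu)
{
	return cost * (nr_cpus * util_per_cpu) / scale_cpu;
}

/*
 * Smallest number of CPUs (searched up to 1024) for which the estimate
 * no longer fits in a u32; returns 0 if none was found.
 */
static unsigned int first_overflowing_nr_cpus(uint64_t cost,
					      uint64_t util_per_cpu,
					      uint64_t scale_cpu)
{
	unsigned int x;

	for (x = 1; x <= 1024; x++) {
		if (energy_estimate(cost, util_per_cpu, x, scale_cpu) > UINT32_MAX)
			return x;
	}
	return 0;
}
```

With cost = 0xffff * 10000 and scale_cpu = 1024, a utilization of 800 still fits for 8 CPUs and overflows at 9, while a utilization of 1024 fits for 6 CPUs and overflows at 7.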

I'm not aware of any arm32 b.L. systems with > 4 CPUs in a PD.

True, arm32 didn't support more than 4 CPUs in a cluster.
We would be safe for them, but I don't want this assumption to
break any other 32bit platform from competitors, which might
create such 32bit 16-core clusters.

If Peter, Vincent and you are OK with adding this assumption about
the max safe number of CPUs, then we can get rid of patch 1/3.

But the temporary u64 division must stay, because there is an
arm32 platform which needs it. So also returning u64 does no big
harm and looks more consistent.
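
To illustrate why the temporary value needs 64 bits, here is a userspace sketch (function names invented for illustration; it assumes the usual targets where unsigned int is 32 bits wide). The product cost * sum_util exceeds 2^32 even for a single CPU at the *10000 precision, so a 32-bit multiplication wraps before the division. In the kernel a 64-bit division on a 32-bit target would typically go through a helper such as div64_u64(); in this sketch plain '/' is enough.

```c
#include <stdint.h>

/* Intermediate product kept in 64 bits: no wrap-around. */
static uint64_t energy_u64(uint64_t cost, uint64_t sum_util, uint64_t scale_cpu)
{
	return cost * sum_util / scale_cpu;
}

/*
 * Same expression in 32-bit arithmetic: the product wraps modulo 2^32
 * before the division, so the result is wrong for large inputs.
 */
static uint32_t energy_u32_wrapping(uint32_t cost, uint32_t sum_util,
				    uint32_t scale_cpu)
{
	return cost * sum_util / scale_cpu;
}
```

For cost = 0xffff * 10000 and sum_util = 800 the two helpers disagree, although the correct result itself would still fit in a u32.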
