Re: [PATCH 1/3] sched/fair: Prepare variables for increased precision of EAS estimated energy

From: Lukasz Luba
Date: Wed Jul 07 2021 - 03:49:53 EST

Next message: Vincent Guittot: "Re: [PATCH] nohz: nohz idle balancing per node"
Previous message: Jiri Slaby: "Re: [PATCH v4] tty: serial: jsm: allocate queue buffer at probe time"
In reply to: Vincent Guittot: "Re: [PATCH 1/3] sched/fair: Prepare variables for increased precision of EAS estimated energy"
Next in thread: Vincent Guittot: "Re: [PATCH 1/3] sched/fair: Prepare variables for increased precision of EAS estimated energy"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 7/7/21 8:07 AM, Vincent Guittot wrote:

On Fri, 25 Jun 2021 at 17:26, Lukasz Luba <lukasz.luba@xxxxxxx> wrote:

The Energy Aware Scheduler (EAS) tries to find the best CPU for a waking
task. It probes many possibilities and compares the estimated energy values
for different scenarios. To calculate those energy values it relies on
Energy Model (EM) data and em_cpu_energy(). The EM data is expressed in
milli-Watts (or an abstract scale), which is sometimes not precise enough.
Two CPUs from different Performance Domains (PDs) can get the same
calculated value for a given task placement even though they would differ
on a more precise scale. This rounding error has to be addressed. This
patch prepares the EAS code for better precision in the coming EM
improvements.

Could you explain why 32-bit results are not enough and why you need to
move to 64 bits?

Right now the result is in the range [0..2^32[ mW. If you need more
precision and want to return uW instead, the result will be in the
range [0..4kW[, which still seems to be enough.

Currently the max value for 'power' in the EM is limited by
EM_MAX_POWER 0xffff (64k - 1). We allow registering power values that
large, ~64k mW (~64 Watts), for an OPP. Then, based on 'power', we
pre-calculate the 'cost' fields:
cost[i] = power[i] * freq_max / freq[i]
So, for the max freq, cost == power. Let's use that in the example.
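
A minimal sketch of that pre-calculation, for illustration only. The
struct opp type, its field names and compute_costs() are hypothetical and
are not the kernel's actual EM code; the table is assumed to be sorted by
ascending frequency so that the last entry holds freq_max.

```c
#include <stdint.h>

/* Hypothetical OPP entry; field names are assumptions for this sketch. */
struct opp {
	uint64_t freq_khz;	/* frequency of this OPP */
	uint64_t power_mw;	/* registered power, in mW */
	uint64_t cost;		/* pre-calculated cost, filled in below */
};

/*
 * cost[i] = power[i] * freq_max / freq[i]
 * Assumes n >= 1, non-zero frequencies, and a table sorted by ascending
 * frequency, so table[n - 1] is the highest OPP.
 */
static void compute_costs(struct opp *table, int n)
{
	uint64_t freq_max = table[n - 1].freq_khz;

	for (int i = 0; i < n; i++)
		table[i].cost = table[i].power_mw * freq_max / table[i].freq_khz;
}
```

For the highest OPP, freq[i] == freq_max, so the division cancels out and
cost equals power, which is the case used in the example.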

Then em_cpu_energy() calculates the energy as follows:
cost * sum_util / scale_cpu
We are interested in the first part: the value of the multiplication.

The per-CPU utilization that we can see for x CPUs which have
scale_cpu=1024 can be close to 800, so sum_util is about x * 800. Let's
use it in the example:
cost * sum_util = 64k * (x * 800), where
x=4: ~200mln
x=8: ~400mln
x=16: ~800mln
x=64: ~3200mln (last one which would fit in u32)
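
The numbers above can be checked with a small sketch. The helper names
energy_product() and fits_in_u32() are made up for this illustration; the
multiplication is done in 64 bits on purpose so the true value stays
visible and can be compared against the u32 limit.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * cost * sum_util, where sum_util = nr_cpus * util_per_cpu.
 * Computed in 64 bits so that the real magnitude is not lost.
 */
static uint64_t energy_product(uint64_t cost, uint64_t nr_cpus,
			       uint64_t util_per_cpu)
{
	return cost * (nr_cpus * util_per_cpu);
}

/* True if the value could be stored in a 32-bit unsigned variable. */
static bool fits_in_u32(uint64_t v)
{
	return v <= UINT32_MAX;
}
```

With cost = 0xffff and 800 utilization per CPU, fits_in_u32() still holds
for 64 CPUs, but doubling either factor once more pushes the product past
2^32.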

When we increase the precision by even a factor of 100, the largest of
the above values no longer fit in a u32. Even a max cost of e.g. 10k mW
with *100 precision has issues:
cost * sum_util = (10k *100) * (x * 800), where
x=4: ~3200mln
x=8: ~6400mln (does not fit in u32)

For *1000 precision even a power of 1 Watt becomes an issue:
cost * sum_util = (1k *1000) * (x * 800), where
x=4: ~3200mln
x=8: ~6400mln (does not fit in u32)
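
To make the failure mode concrete, here is a sketch comparing a 32-bit
and a 64-bit intermediate for the same inputs. The function names and the
'scale' parameter (the precision multiplier, e.g. 100 or 1000) are
assumptions for this example and do not come from the patch itself.

```c
#include <stdint.h>

/*
 * 32-bit intermediate: unsigned arithmetic wraps modulo 2^32, so a
 * product above UINT32_MAX silently turns into a smaller, wrong value.
 */
static uint32_t product_u32(uint32_t power_mw, uint32_t scale,
			    uint32_t sum_util)
{
	return power_mw * scale * sum_util;
}

/* 64-bit intermediate: widen before multiplying to keep the full value. */
static uint64_t product_u64(uint32_t power_mw, uint32_t scale,
			    uint32_t sum_util)
{
	return (uint64_t)power_mw * scale * sum_util;
}
```

For 1 Watt (1000 mW), *1000 precision and x=8 (sum_util = 6400), the
64-bit version gives 6400000000, while the 32-bit version wraps around to
2105032704, which would make this placement look cheaper than the x=4
case.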

That's why, to make the code safe for bigger power values, I had to use
u64 on 32-bit machines.
