Re: [PATCH 1/2] docs: Clarify abstract scale usage for power values in Energy Model

From: Lukasz Luba
Date: Wed Sep 30 2020 - 10:04:07 EST




On 9/30/20 11:55 AM, Rajendra Nayak wrote:

On 9/30/2020 1:55 PM, Lukasz Luba wrote:
Hi Douglas,

On 9/30/20 12:53 AM, Doug Anderson wrote:
Hi,

On Tue, Sep 29, 2020 at 5:16 AM Lukasz Luba <lukasz.luba@xxxxxxx> wrote:

The Energy Model (EM) can store power values in milli-Watts or in abstract
scale. This might cause issues in the subsystems which use the EM for
estimating the device power, such as:
- mixing of different scales in a subsystem which uses multiple
   (cooling) devices (e.g. thermal Intelligent Power Allocation (IPA))
- assuming that energy [milli-Joules] can be derived from the EM power
   values which might not be possible since the power scale doesn't have to
   be in milli-Watts

To avoid misconfiguration add the needed documentation to the EM and
related subsystems: EAS and IPA.

Signed-off-by: Lukasz Luba <lukasz.luba@xxxxxxx>
---
  .../driver-api/thermal/power_allocator.rst          |  8 ++++++++
  Documentation/power/energy-model.rst                | 13 +++++++++++++
  Documentation/scheduler/sched-energy.rst            |  5 +++++
  3 files changed, 26 insertions(+)

I haven't read through these files in massive detail, but the quick
skim makes me believe that your additions seem sane.  In general, I'm
happy with documenting reality, thus:

Reviewed-by: Douglas Anderson <dianders@xxxxxxxxxxxx>

Thank you for the review.


I will note: you haven't actually updated the device tree bindings.
Thus, presumably, anyone who is specifying these numbers in the device
tree is still supposed to specify them in a way that mW can be
recovered, right?  Said another way: nothing about your patches makes
it OK to specify numbers in device trees using an "abstract scale",
right?

For completeness, we are talking here about the binding from:
Documentation/devicetree/bindings/arm/cpus.yaml
which is 'dynamic-power-coefficient'. Yes, it stays untouched, and so
does its unit (uW/MHz/V^2), which is what allows the power values in
the EM to be in mW.
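
To make the unit relationship concrete, here is a minimal sketch of the
arithmetic, not the kernel code itself. It assumes the usual dynamic power
model P = C * f * V^2, with the coefficient in uW/MHz/V^2, the frequency
in MHz and the voltage in mV. The function name is hypothetical.

```python
def dynamic_power_mw(coeff_uw_per_mhz_v2, freq_mhz, volt_mv):
    """Estimate dynamic power in mW from a coefficient in uW/MHz/V^2.

    coeff * MHz * V^2 gives uW. The voltage is passed in mV, so mV^2
    contributes a factor of 1e6, and uW -> mW contributes another 1e3,
    hence the single division by 1e9 (integer math, rounding down).
    """
    return coeff_uw_per_mhz_v2 * freq_mhz * volt_mv * volt_mv // 1_000_000_000


if __name__ == "__main__":
    # Hypothetical operating point: 1000 MHz at 1000 mV, coefficient 100.
    print(dynamic_power_mw(100, 1000, 1000))
```

If the coefficient is scaled by some vendor factor, the result is scaled by
the same factor and is no longer a value in mW.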

So for platforms where 'dynamic-power-coefficient' is specified in device tree,
it's always expected to be derived from 'real' power numbers on these platforms
in 'real' mW?

Yes, the purpose and the name of that binding was only for 'real'
power in mW.


At least on Qualcomm platforms we have these numbers scaled, so in essence they
can't be used to derive 'real' mW values. That said, we also do not have the
potential problem of mixing devices of two different scales in one thermal
zone.
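
For readers unfamiliar with the mixing problem, a simplified sketch follows.
It assumes a power budget is split among devices in proportion to the power
each one requests, which is only a rough model of what a power allocator
does. The device names, numbers and the 1/10 scale factor are hypothetical.

```python
def divvy_up(requests, budget):
    """Split `budget` among devices in proportion to their requested power."""
    total = sum(requests.values())
    return {name: req * budget // total for name, req in requests.items()}


if __name__ == "__main__":
    # Both devices report in mW: the budget is split evenly.
    same_scale = {"cpu": 2000, "gpu": 2000}
    # Same physical load, but the GPU reports in an abstract scale
    # (assumed here to be 1/10 of the mW value).
    mixed_scale = {"cpu": 2000, "gpu": 200}

    print(divvy_up(same_scale, 3000))
    print(divvy_up(mixed_scale, 3000))
```

With a single scale the split reflects the real load. With mixed scales the
device reporting smaller numbers is granted far less than it needs.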

If you have these numbers scaled, then it's probably documented
somewhere in your docs for your OEMs, because they might assume it's in
uW/MHz/V^2 (according to the bindings doc). If not, they probably
realized it during the measurements and comparison (that the power in
EM is not what they see on the power meter).
This binding actually helps those developers who run the experiments
and, based on the measured power values, store the derived coefficient.
Everyone can just measure in a local setup and compare the results
easily, speaking the same language (and maybe propose a patch adjusting
the value in DT).
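
As an illustration of that workflow, here is a minimal sketch of deriving
the coefficient from one measurement. It assumes the measured value is
dynamic power only (static power already subtracted) and the model
P = C * f * V^2. The function name and the sample numbers are hypothetical.

```python
def derive_coefficient(power_mw, freq_mhz, volt_mv):
    """Derive a coefficient in uW/MHz/V^2 from measured dynamic power.

    Inverts P[mW] = C * f[MHz] * V[mV]^2 / 1e9, rounding down.
    """
    return power_mw * 1_000_000_000 // (freq_mhz * volt_mv * volt_mv)


if __name__ == "__main__":
    # Hypothetical measurement: 100 mW at 1000 MHz and 1000 mV.
    print(derive_coefficient(100, 1000, 1000))
```

Two people measuring the same SoC should arrive at similar coefficients,
which only works if the value in DT is kept in the documented unit.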


So the question is, can such platforms still use 'dynamic-power-coefficient'
in device tree and create an abstract scale? The other way of doing this would
be to *not* specify this value in device tree and have these values stored in the
cpufreq driver and register a custom callback to do the math.

But then we would also have to change the name of that binding.

I'd recommend the second way that you've described. It will avoid
confusing your OEMs. In your cpufreq driver you can simply register
with the EM using em_dev_register_perf_domain(). In your local
callback you can do whatever you need (read driver array, firmware,
DT, scale or not, etc).
The helper code in dev_pm_opp_of_register_em() is probably not suited
for your use case (when you don't want to share the real power of the
SoC).


It just feels like jumping through hoops just to deal with the fact that the
device tree bindings say it's expected to be in mW and can't be abstract.


I don't want to add more confusion into the EM power values topic.
Overloading the meaning of that binding would create more mess.

Regards,
Lukasz