Re: [RFC PATCH v4 12/12] OPTIONAL: cpufreq: dt: Register an Energy Model

From: Vincent Guittot
Date: Mon Jul 30 2018 - 11:53:38 EST


Hi Quentin,

I have noticed the new version but prefer to continue on this thread
to keep the history of the discussion.

On Fri, 6 Jul 2018 at 12:19, Quentin Perret <quentin.perret@xxxxxxx> wrote:
>
> On Friday 06 Jul 2018 at 12:10:02 (+0200), Vincent Guittot wrote:
> > On Thu, 28 Jun 2018 at 13:41, Quentin Perret <quentin.perret@xxxxxxx> wrote:
> > > +static int of_est_power(unsigned long *mW, unsigned long *KHz, int cpu)
> > > +{
> > > + unsigned long mV, Hz, MHz;
> > > + struct device *cpu_dev;
> > > + struct dev_pm_opp *opp;
> > > + struct device_node *np;
> > > + u32 cap;
> > > + u64 tmp;
> > > +
> > > + cpu_dev = get_cpu_device(cpu);
> > > + if (!cpu_dev)
> > > + return -ENODEV;
> > > +
> > > + np = of_node_get(cpu_dev->of_node);
> > > + if (!np)
> > > + return -EINVAL;
> > > +
> > > + if (of_property_read_u32(np, "dynamic-power-coefficient", &cap))
> > > + return -EINVAL;
> > > +
> > > + Hz = *KHz * 1000;
> > > + opp = dev_pm_opp_find_freq_ceil(cpu_dev, &Hz);
> > > + if (IS_ERR(opp))
> > > + return -EINVAL;
> > > +
> > > + mV = dev_pm_opp_get_voltage(opp) / 1000;
> > > + dev_pm_opp_put(opp);
> > > +
> > > + MHz = Hz / 1000000;
> > > + tmp = (u64)cap * mV * mV * MHz;
> > > + do_div(tmp, 1000000000);
> >
> > Could you explain the formula above ? and especially the 1000000000 it
> > seems related to the use of mV and mW instead of uV and uW ...
>
> That's correct, this is just a matter of units. I simply tried to
> reproduce here what is currently done for IPA:
>
> https://elixir.bootlin.com/linux/latest/source/drivers/thermal/cpu_cooling.c#L252

ok, so you copy/paste what is done in cpu cooling device ?

Nevertheless I still have some concerns with the formula used here and
in cpu cooling device:

https://elixir.bootlin.com/linux/latest/source/Documentation/devicetree/bindings/arm/cpus.txt#L273
The documentation says
Pdyn = dynamic-power-coefficient * V^2 * f where voltage is in uV,
frequency is in MHz.
and dynamic power coefficient in units of mW/MHz/uV^2.

This implies that the results is in mW and we can summarize the formula with :

mW = C * (uV)^2 * Mhz with C = dynamic-power-coefficient
then uV = 1000*mV
mW = C * (1000*mV)^2 * Mhz
mW = C * 1000000 * mV * mV * Mhz

Which is different from what is used here and in cpu cooling device.
I may have done something wrong with the formula but i can't catch
it... My brain is not yet fully back from vacations

Mhz = Hz / 1000000

mW = C * mV * mV * 1000000 * Hz / 1000000
mW = C * mV * mV * Hz

Let take a simple example with
C = 1
Voltage = 1000 uV = 1 mV
freq = 2 Mhz = 2000000 Hz

mW = C * (uV)^2 * Mhz = 1 * (1000)^2 * 2 = 2000000
mW = C * mV * mV * Hz = 1 * 1 * 1 * 2000000 = 2000000
mW = C * mV * mV * MHz / 1000000000 = 1 * 1 * 1 * 2 / 1000000000 = 0

so there is a mismatch between the formula in the documentation and
the formula used in cpu_cooling.c (and in this patch).

That being said, I can understand that we can easily overflow the
result if we use some kind of realistic values like the one for
hikey960:
dynamic-power-coefficient for cpu4 is : 550
for highest OPP of cpu4:
opp-hz = /bits/ 64 <2362000000>;
opp-microvolt = <1100000>;

Which gives with the formula in the documentation:
mW = 550 * (1100000)^2 * 2362 = 1,571911Ã10^18 which is quite huge
even for a big core running at max freq
and with the formula in cpu_cooling.c:
mW = 550 * 1100 * 1100 * 2362 / 1000000000 = 1571 which seems to be a
bit more reasonable

So it seems that the dynamic-power-coefficient unit is not mW/MHz/uV^2
but mW/Ghz/V^2
And given that we use integer, we provide Voltage and frequency by
(mV/1000) and (Mhz/1000) and factorize all /1000

May be Javi who added the formula for cpu cooling can confirm on this ?

>
> >
> > Can't you just optimize that into
> > tmp = (u64)cap * mV * mV * Hz;
> > do_div(tmp, 1000);

In fact the do_div is not needed, I thought that dyn coef unit was
uW/Mhz/uV^2 whereas it's written mW/MHz/uV^2

>
> I don't think that'll work. If you use Hz instead of MHz you'll need to
> divide tmp by 1000000000000000 to get milli-watts, as specified by the
> EM framework.
>
> FYI, I don't consider this patch to be ready to be merged. I kept it
> self-contained and simple for the example but I actually think that this
> of_est_power function should probably be factored-out somewhere else.
> scpi-cpufreq could use it as well. Anyway, I guess we can discuss that
> later once the APIs of the EM framework are agreed :-)

It's just that the tests results that you provided in the cover letter
has used this patch and running tests with wrong formula (it seems
that's the documentation is wrong) doesn't give much confidence on the
results

Thanks
Vincent
>
> Thanks,
> Quentin