Re: [PATCH] arm64: dts: qcom: sc7180: Add 'sustainable_power' for CPU thermal zones

From: Rajendra Nayak
Date: Thu Sep 03 2020 - 01:35:53 EST



On 9/3/2020 10:14 AM, Rajendra Nayak wrote:

On 9/2/2020 9:02 PM, Doug Anderson wrote:
Hi,

On Tue, Sep 1, 2020 at 10:36 PM Rajendra Nayak <rnayak@xxxxxxxxxxxxxx> wrote:


* In terms of the numbers here, I believe that you're claiming that we
can dissipate 768 mW * 6 + 1202 mW * 2 = ~7 Watts of power.  My memory
of how much power we could dissipate in previous laptops I worked on
is a little fuzzy, but that doesn't seem insane for a passively-cooled
laptop.  However, I think someone could conceivably put this chip in a
smaller form factor.  In such a case, it seems like we'd want these
things to sum up to ~2000 (if it would ever make sense for someone to
put this chip in a phone) or ~4000 (if it would ever make sense for
someone to put this chip in a small tablet).  It seems possible that,
to achieve this, we might have to tweak the
"dynamic-power-coefficient".

DPC values are calculated (at a SoC) by actually measuring max power at various
frequency/voltage combinations by running things like dhrystone.
How would the max power a SoC can generate depend on form factors?
How much it can dissipate sure is, but then I am not super familiar how
thermal frameworks end up using DPC for calculating power dissipated,
I am guessing they don't.

I don't know how much thought was put
into those numbers, but the fact that the little cores have a super
round 100 for their dynamic-power-coefficient makes me feel like they
might have been more schwags than anything.  Rajendra maybe knows?

FWIK, the values are always scaled and normalized to 100 for silver and
then used to derive the relative DPC number for gold. If you see the DPC
for silver cores even on sdm845 is a 100.
Again these are not estimations but based on actual power measurements.

The scaling to 100 doesn't seem to match how the thermal framework is
using them.  Take a look at of_cpufreq_cooling_register().  It takes
the "dynamic-power-coefficient" and passes it as "capacitance" into
__cpufreq_cooling_register().  That's eventually used to compute
power, which is documented in the code to be in mW.

power = (u64)capacitance * freq_mhz * voltage_mv * voltage_mv;
do_div(power, 1000000000);

/* power is stored in mW */
freq_table[i].power = power;

That's used together with "sustainable-power", which is the attribute
that Matthias is trying to set.  That value is documented to be in mW
as well.

...so if the silver cores are always scaled to 100 regardless of how
much power they actually draw then it'll be impossible to actually
think about "sustainable-power" as a mW value.  Presumably we either
need to accept that fact (and ideally document it) or we need to
change the values for silver / gold cores (we could still keep the
relative values the same and just scale them).

That sounds reasonable (still keep the relative values and scale them)
I'll get back on what those scaled numbers would look like, and try to
get some sense of why this scaling to 100 was done (like you said
I don't see any documentation on this), but I see atleast a few other non-qcom
SoCs doing this too in mainline (like rockchip/rk3399)

On second thoughts, why wouldn't a relative 'sustainable-power' value work?
On every device, one would need to do the exercise that Matthias did to come
up with the OPP at which we can sustain max CPU/GPU loads anyway.
I mean even if we do change the DPC values to match actual power, Matthias would
still observe that we can sustain at the very same OPP and not any different.
Its just that the mW values that are passed to kernel are relative and not
absolute. My worry is that perhaps no SoC vendor wants to put these absolute numbers
out.

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation