Re: [RFC PATCH 06/16] arm: topology: Define TC2 sched energy and provide it to scheduler

From: Morten Rasmussen
Date: Fri Jun 06 2014 - 09:15:24 EST


On Wed, Jun 04, 2014 at 05:16:18PM +0100, Peter Zijlstra wrote:
> On Wed, Jun 04, 2014 at 04:42:27PM +0100, Morten Rasmussen wrote:
> > On Tue, Jun 03, 2014 at 12:44:28PM +0100, Peter Zijlstra wrote:
> > > On Fri, May 23, 2014 at 07:16:33PM +0100, Morten Rasmussen wrote:
> > > > +static struct capacity_state cap_states_cluster_a7[] = {
> > > > + /* Cluster only power */
> > > > + { .cap = 358, .power = 2967, }, /* 350 MHz */
> > > > + { .cap = 410, .power = 2792, }, /* 400 MHz */
> > > > + { .cap = 512, .power = 2810, }, /* 500 MHz */
> > > > + { .cap = 614, .power = 2815, }, /* 600 MHz */
> > > > + { .cap = 717, .power = 2919, }, /* 700 MHz */
> > > > + { .cap = 819, .power = 2847, }, /* 800 MHz */
> > > > + { .cap = 922, .power = 3917, }, /* 900 MHz */
> > > > + { .cap = 1024, .power = 4905, }, /* 1000 MHz */
> > > > + };
> > > > +
> > > > +static struct capacity_state cap_states_cluster_a15[] = {
> > > > + /* Cluster only power */
> > > > + { .cap = 840, .power = 7920, }, /* 500 MHz */
> > > > + { .cap = 1008, .power = 8165, }, /* 600 MHz */
> > > > + { .cap = 1176, .power = 8172, }, /* 700 MHz */
> > > > + { .cap = 1343, .power = 8195, }, /* 800 MHz */
> > > > + { .cap = 1511, .power = 8265, }, /* 900 MHz */
> > > > + { .cap = 1679, .power = 8446, }, /* 1000 MHz */
> > > > + { .cap = 1847, .power = 11426, }, /* 1100 MHz */
> > > > + { .cap = 2015, .power = 15200, }, /* 1200 MHz */
> > > > + };
> > >
> > >
> > > So how did you obtain these numbers? Did you use numbers provided by the
> > > hardware people, or did you run a particular benchmark and record the
> > > power usage?
> > >
> > > Does that benchmark do some actual work (as opposed to a while(1) loop)
> > > to keep more silicon lit up?
> >
> > Hardware people don't like sharing data, so I did my own measurements
> > and calculations to get the numbers above.
> >
> > ARM TC2 has on-chip energy counters for counting energy consumed by the
> > A7 and A15 clusters. They are fairly accurate.
>
> Recent Intel chips have that too; they come packaged as:
>
> perf stat -a -e "power/energy-cores/" -- cmd
>
> (through the perf_event_intel_rapl.c driver). It would be ideal if the
> ARM equivalent was available through a similar interface.
>
> http://lwn.net/Articles/573602/

Nice. On ARM, energy counters are not mandatory, and what they actually
measure, when implemented, is implementation dependent. However, each
vendor already does extensive evaluation and characterization of their
implementation, so I don't think it would be a problem for them to
provide the numbers.

> > I used sysbench cpu
> > benchmark as test workload for the above numbers. sysbench might not be
> > a representative workload, but it is easy to use. I think, ideally,
> > vendors would run their own mix of workloads they care about and derive
> > their numbers for their platform based on that.
> >
> > > If you have a setup for measuring these, should we try and publish that
> > > too so that people can run it on their platform and provide these
> > > numbers?
> >
> > The workload setup I used is quite simple. I ran sysbench with taskset with
> > different numbers of threads to extrapolate power consumed by each
> > individual cpu and how much comes from just powering on the domain.
> >
> > Measuring the actual power is very platform specific. Developing a fully
> > automated tool to do it for any given platform isn't straightforward, but
> > I'm happy to share how I did it. I can add a description of the method I
> > used on TC2 to the documentation so others can use it as reference.
>
> That would be good I think, esp. if we can get similar perf based energy
> measurement things sorted. And if we make the tool consume the machine
> topology present in sysfs we can get a long way towards automating this
> I think.

Some of the measurements could be automated. Others, such as wakeup
energy, are hard to automate as they require extensive knowledge about
the platform. You may need to do various tricks and hacks to force the
platform to use a specific idle-state so you know what you are
measuring.

I will add the TC2 recipe as a start and then see if my ugly scripts can
be turned into something generally useful.
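
As a rough illustration of the extrapolation mentioned above (sysbench
pinned with taskset to different numbers of cpus), here is a minimal
sketch. It assumes, for illustration only, that measured cluster power is
roughly linear in the number of busy cpus, so that a least-squares line
gives the cluster-only power as the intercept and the per-cpu power as the
slope. The names split_power and struct power_split are hypothetical, and
this is not necessarily how the TC2 numbers were actually derived.

```c
#include <stddef.h>

/* Result of fitting P(n) = cluster_power + n * cpu_power. */
struct power_split {
	double cluster_power;	/* intercept: domain powered, no busy cpus */
	double cpu_power;	/* slope: extra power per busy cpu */
};

/*
 * Ordinary least-squares line fit (hypothetical helper).
 *
 * busy_cpus[i]: number of busy cpus in run i
 * power[i]:     measured average cluster power in run i (any unit)
 *
 * Returns 0 on success, -1 if there are fewer than two runs or all
 * runs used the same number of cpus (the line is then undetermined).
 */
static int split_power(const int *busy_cpus, const double *power, size_t n,
		       struct power_split *out)
{
	double sx = 0, sy = 0, sxx = 0, sxy = 0, denom;
	size_t i;

	if (n < 2)
		return -1;

	for (i = 0; i < n; i++) {
		sx += busy_cpus[i];
		sy += power[i];
		sxx += (double)busy_cpus[i] * busy_cpus[i];
		sxy += busy_cpus[i] * power[i];
	}

	denom = n * sxx - sx * sx;
	if (denom == 0)
		return -1;

	out->cpu_power = (n * sxy - sx * sy) / denom;
	out->cluster_power = (sy - out->cpu_power * sx) / n;
	return 0;
}
```

With made-up readings of 3500, 4000 and 4500 for 1, 2 and 3 busy cpus,
the fit gives cpu_power = 500 and cluster_power = 3000. Repeating this
per frequency would give one (cluster, cpu) pair per capacity state.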