Re: [RFC PATCH 2/8] Documentation: arm: define DT cpu capacity bindings

From: Juri Lelli
Date: Tue Dec 01 2015 - 06:20:27 EST

Hi Vincent,

On 30/11/15 10:59, Vincent Guittot wrote:
> Hi Juri,
>
> On 24 November 2015 at 11:54, Juri Lelli <juri.lelli@xxxxxxx> wrote:
> > Hi,
> >
> > On 23/11/15 20:06, Rob Herring wrote:
> >> On Mon, Nov 23, 2015 at 02:28:35PM +0000, Juri Lelli wrote:
> >> > ARM systems may be configured to have cpus with different power/performance
> >> > characteristics within the same chip. In this case, additional information
> >> > has to be made available to the kernel (the scheduler in particular) for it
> >> > to be aware of such differences and take decisions accordingly.
> >> >
>
> [snip]
>
> >> > +==========================================
> >> > +2 - CPU capacity definition
> >> > +==========================================
> >> > +
> >> > +CPU capacity is a number that provides the scheduler information about CPU
> >> > +heterogeneity. Such heterogeneity can come from micro-architectural differences
> >> > +(e.g., ARM big.LITTLE systems) or maximum frequency at which CPUs can run
> >> > +(e.g., SMP systems with multiple frequency domains). Heterogeneity in this
> >> > +context is about differing performance characteristics; this binding tries to
> >> > +capture a first-order approximation of the relative performance of CPUs.
> >> > +
> >> > +One simple way to estimate CPU capacities is to iteratively run a well-known
> >> > +CPU user space benchmark (e.g., sysbench, dhrystone, etc.) on each CPU at
> >> > +maximum frequency and then normalize values w.r.t. the best performing CPU.
> >> > +One can also do a statistically significant study of a wide collection of
> >> > +benchmarks, but pros of such an approach are not really evident at the time of
> >> > +writing.
> >> > +
> >> > +==========================================
> >> > +3 - capacity-scale
> >> > +==========================================
> >> > +
> >> > +CPU capacities are defined with respect to the capacity-scale property in the cpus
> >> > +node [1]. The property is optional; if not defined a 1024 capacity-scale is
> >> > +assumed. This property defines both the highest CPU capacity present in the
> >> > +system and granularity of CPU capacity values.
> >>
> >> I don't really see the point of this vs. having an absolute scale.
> >>
> >
> > IMHO, we need this for several reasons, one being to address one of your
> > concerns below: vendors are free to choose their scale without being
> > forced to publish absolute data. Another reason is that it might make
> > life easier in certain cases; for example, someone could implement a
> > system with a few clusters of, say, A57s, but some run at half the clock
> > of the others (e.g., you have a 1.2GHz cluster and a 600MHz cluster); in
> > this case I think it is just easier to define capacity-scale as 1200 and
> > capacities as 1200 and 600. The last reason I can think of right now is
> > that we probably don't want to bind ourselves to some particular range
> > from the beginning, as that range might be enough now, but it could
> > change in the future (as in, right now [1-1024] looks fine for
> > scheduling purposes, but that might change).
>
> Like Rob, I don't really see the benefit of this optional
> capacity-scale property. Parsing the capacity of all cpu nodes should
> give you a range as well.
> IMHO, this property looks more like an optimization of the code that
> will parse the DT than a HW description.
>

I agree that we can derive the same information by just looking at the
biggest capacity value across all CPUs and treating that value as
capacity-scale. I just thought that making it explicit would be clearer,
since in a DT with many CPUs it might not be easy to see at a glance
which capacity value is the biggest. But, yes, we could remove it
anyway.
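
To make that concrete, here is a minimal sketch of deriving the scale
from the per-CPU capacity values and rescaling them to the 1024 default.
It is Python for illustration only, not kernel code; the function names
and the DEFAULT_SCALE constant are hypothetical, and the sample values
reuse the 1200/600 example from earlier in the thread.

```python
# Illustrative only: names below are hypothetical, not an existing kernel API.
DEFAULT_SCALE = 1024  # the default scale mentioned in the binding text


def implied_capacity_scale(capacities):
    """Return the biggest per-CPU capacity, treated as the implicit scale."""
    if not capacities:
        raise ValueError("no CPU capacities given")
    scale = max(capacities.values())
    if scale <= 0:
        raise ValueError("capacities must be positive")
    return scale


def normalize(capacities, target=DEFAULT_SCALE):
    """Rescale raw capacities so that the biggest CPU maps to `target`."""
    scale = implied_capacity_scale(capacities)
    # Integer arithmetic, since the binding describes capacity as a u32.
    return {cpu: cap * target // scale for cpu, cap in capacities.items()}


# Two clusters of the same core, one clocked at half the frequency.
raw = {"cpu0": 1200, "cpu1": 1200, "cpu2": 600, "cpu3": 600}
normalized = normalize(raw)
```

With these inputs the implied scale is 1200, so no explicit
capacity-scale property would be needed to recover it.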

Thanks,

- Juri

> >
> >> > +
> >> > +==========================================
> >> > +4 - capacity
> >> > +==========================================
> >> > +
> >> > +capacity is an optional cpu node [1] property: u32 value representing CPU
> >> > +capacity, relative to capacity-scale. It is required and enforced that capacity
> >> > +<= capacity-scale.
> >>
> >> I think you need something absolute and probably per MHz (like
> >> dynamic-power-coefficient property). Perhaps the IPC (instructions per
> >> clock) value?
> >>
> >> In other words, I want to see these numbers have a defined method
> >> of determining them and don't want to see random values from every
> >> vendor. ARM, Ltd. says core X has a value of Y would be good enough for
> >> me. Vendor X's A57 having a value of 2 and Vendor Y's A57 having a
> >> value of 1024 is not what I want to see. Of course things like cache
> >> sizes can vary the performance, but is a baseline value good enough?
> >>
> >
> > A standard reference baseline is what we advocate with this set, but
> > making this baseline work for every vendor's implementation is hardly
> > achievable, IMHO. I don't think we can come up with any number that
> > applies to each and every implementation; you can have different
> > revisions of the same core and vendors might make implementation choices
> > that end up with different peak performance.
> >
> >> However, no vendor will want to publish their values if these are
> >> absolute values relative to other vendors.
> >>
> >
> > Right. That is why I think we need to abstract numbers, as we do with
> > capacity-scale.
> >
> >> If you expect these to need frequent tuning, then don't put them in DT.
> >>
> >
> > I expect that it is possible to come up with a sensible baseline number
> > for a specific platform implementation, so there is value in
> > standardizing how we specify this value and how it is then consumed.
> > Finer grained tuning might then happen both offline (with changes to the
> > mainline DT) and online (using the sysfs interface), but that should
> > only apply to a narrow set of use cases.
> >
> > Thanks,
> >
> > - Juri
>