Re: [GIT] kbuild/lto changes for 3.15-rc1

From: Ingo Molnar
Date: Mon Apr 14 2014 - 06:55:52 EST



* Markus Trippelsdorf <markus@xxxxxxxxxxxxxxx> wrote:

> On 2014.04.14 at 12:32 +0200, Ingo Molnar wrote:
> >
> > * Markus Trippelsdorf <markus@xxxxxxxxxxxxxxx> wrote:
> >
> > > On 2014.04.09 at 08:01 +0200, Ingo Molnar wrote:
> > > >
> > > > * Andi Kleen <ak@xxxxxxxxxxxxxxx> wrote:
> > > >
> > > > > On Tue, Apr 08, 2014 at 03:44:25PM -0700, Linus Torvalds wrote:
> > > > > > On Tue, Apr 8, 2014 at 1:49 PM, <josh@xxxxxxxxxxxxxxxx> wrote:
> > > > > > >
> > > > > > > In addition to making the kernel smaller and such (I'll leave the
> > > > > > > specific stats there to Andi), here's the key awesomeness of LTO that
> > > > > > > you, personally, should find useful and compelling: LTO will eliminate
> > > > > > > the need to add many lower-level Kconfig symbols to compile out bits of
> > > > > > > the kernel.
> > > > > >
> > > > > > Actually that, to me, is a negative right now.
> > > > > >
> > > > > > Since there's no way we'll make LTO the default in the foreseeable
> > > > > > future, people starting to use it like that is just a bad bad thing.
> > > > > >
> > > > > > So really, the main advantage of LTO would be any actual
> > > > > > optimizations it can do. And call me anal, but I want *numbers*
> > > > > > for that before I merge it. Not handwaving. I'm not actually aware
> > > > > > of how well - if at all - code generation actually improves.
> > > > >
> > > > > Well it looks very different if you look at the generated code. gcc
> > > > > becomes a lot more aggressive.
> > > > >
> > > > > But as I said there's currently no significant performance
> > > > > improvement known, so if your only goal is better performance this
> > > > > > patch (as currently known) is not a big winner. My suspicion is
> > > > > that's mostly because the standard benchmarks we run are not too
> > > > > compiler sensitive.
> > > > >
> > > > > However the users seem to care about the other benefits, like code
> > > > > size.
> > > > >
> > > > > And there may well be loads that are compiler sensitive. As Honza
> > > > > posted, for non kernel workloads LTO is known to have large
> > > > > benefits.
> > > > >
> > > > > Besides at this point it's pretty much just some additions to the
> > > > > Makefiles.
> > > >
> > > > So the reason I've been mostly ignoring the LTO patches myself (I only
> > > > took LTO related changes that had other justifications such as
> > > > cleanups) is that I've actually implemented full LTO in a userspace
> > > > project myself, and my experience was:
> > > >
> > > > 1) There was very little if any measurable LTO runtime speedup,
> > > > despite aggressive GCC options and despite user-space generally
> > > > offering more optimization opportunities than kernel space.
> > > >
> > > > 2) LTO with current build tools meant a 1.5x-3x build speed
> > > > slowdown (on a very fast box with tons of CPUs and RAM),
> > > > which made LTO essentially a non-starter for development
> > > > work. (And that was with the Gold linker.)
> > > >
> > > > and looking at your characterisation of LTO you only conceded
> > > > #1 much after you started pushing LTO and you are clearly trying
> > > > to avoid talking about #2 while it's very much relevant...
> > > >
> > > > I'm willing to be convinced by actual numbers, and LTO tooling might
> > > > eventually improve, etc., but right now LTO is much ado about very
> > > > little, being pushed in a somewhat dishonest way.
> > >
> > > I did some measurements on Andi's lto-3.14 branch:
> > >
> > > options      size       build time
> > > ----------------------------------
> > > -O2          4408880    1:56.98
> > > -flto -O2    4213072    2:36.22
> > > -Os          3833248    1:45.13
> > > -flto -Os    3651504    2:34.51
> > > This was measured on my AMD 4 core machine with a monolithic .config
> > > where "CONFIG_MODULES is not set". The compiler is gcc trunk (4.9).
> > > So on x86_64 you get 5% size reduction for 25-30% build time slowdown.
> >
> > Note that the build slowdowns you measured are more like 30-45%:
> >
> > 156.22/116.98 == 33.5% slowdown
> > 154.51/105.13 == 46.9% slowdown
> >
> > not 25-30%.
>
> It's a matter of definition. I've computed the slowdown by taking the
> difference: (a - b) * 100 / a

That definition is IMHO deficient to the point of being broken: for
example, "it got slower by 300%" would be impossible to express,
because as slowdowns get larger the figure just converges to 100% from
below, which is not very intuitive.

So the typical definition for "X got slower by Y%" is that the new
time Z is:

Z = X * (1 + Y/100)

For small deltas the choice of definition does not matter; for larger
ones it does. Anyway, we seem to agree :)
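
To make the difference concrete, here is a minimal Python sketch that
applies both definitions to the build times quoted above. The function
names are made up for illustration and do not come from any tool
mentioned in this thread; the only inputs are the four build times from
the table.

```python
def to_seconds(mmss):
    """Convert a 'm:ss.xx' build time string to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + float(seconds)


def slowdown_vs_baseline(baseline_s, slower_s):
    """Percent slowdown relative to the faster (baseline) time.

    This is the Z = X * (1 + Y/100) definition solved for Y,
    so the result can exceed 100%.
    """
    return (slower_s - baseline_s) * 100.0 / baseline_s


def slowdown_vs_slower(baseline_s, slower_s):
    """Percent difference relative to the slower time.

    This is the (a - b) * 100 / a definition with a = slower time;
    it can never reach 100%.
    """
    return (slower_s - baseline_s) * 100.0 / slower_s


# (non-LTO time, LTO time) pairs taken from the table in this thread.
builds = {
    "-O2": ("1:56.98", "2:36.22"),
    "-Os": ("1:45.13", "2:34.51"),
}

for options, (plain, lto) in builds.items():
    plain_s, lto_s = to_seconds(plain), to_seconds(lto)
    print(
        f"{options}: {slowdown_vs_baseline(plain_s, lto_s):.1f}% vs baseline, "
        f"{slowdown_vs_slower(plain_s, lto_s):.1f}% vs slower time"
    )
```

With these inputs the baseline-relative figures come out at roughly
33.5% and 47.0%, while the other definition gives roughly 25.1% and
32.0%. A build that takes four times as long is a 300% slowdown under
the first definition but only 75% under the second.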

Thanks,

Ingo