Re: [RFC PATCH] x86: enable dead code and data elimination (LTO)

From: Nicholas Piggin
Date: Sun Jul 09 2017 - 22:14:10 EST


On Sun, 9 Jul 2017 09:59:44 -0400 (EDT)
Nicolas Pitre <nicolas.pitre@xxxxxxxxxx> wrote:

> On Sun, 9 Jul 2017, Masahiro Yamada wrote:
>
> > Hi.
> >
> > 2017-07-09 18:05 GMT+09:00 Ingo Molnar <mingo@xxxxxxxxxx>:
> > >
> > > * Nicholas Piggin <npiggin@xxxxxxxxx> wrote:
> > >
> > >> FYI, easiest way to check if you forgot to KEEP a linker table is
> > >> to look at `readelf -S vmlinux` differences, and to see what is
> > >> being trimmed, look at nm differences or use --print-gc-sections
> > >> LD option to see what symbols you're trimming. Linker tables,
> > >> boot entry, and exception entry tend to require anchoring.
> > >
> > > Could you please add a debug build target to display all discarded
> > > symbols/sections? Something like:
> > >
> > > make lto-check
> > >
> > > ... or so?

Some kind of option like this could be a good idea. It could apply
to any kind of link-time optimization we do. I'll think about it.
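
As a rough illustration of the "nm differences" check mentioned above, here
is a minimal sketch in Python. It assumes you have saved the default
(BSD-style) `nm` output of a build without gc-sections and of a build with
it. The helper names, addresses, and the `unused_helper` symbol are
hypothetical, and this is not an existing kernel build target.

```python
def parse_nm(text):
    """Return the set of symbol names found in default `nm` output."""
    symbols = set()
    for line in text.splitlines():
        fields = line.split()
        # Defined symbols look like "<address> <type> <name>",
        # undefined ones like "<type> <name>"; the name is always last.
        if len(fields) >= 2:
            symbols.add(fields[-1])
    return symbols


def discarded_symbols(nm_before, nm_after):
    """Symbols present before garbage collection but absent afterwards."""
    return sorted(parse_nm(nm_before) - parse_nm(nm_after))


# Made-up sample data standing in for two saved `nm vmlinux` dumps.
NM_BEFORE = """\
ffffffff81000000 T _text
ffffffff81001000 t unused_helper
ffffffff81002000 T start_kernel
"""
NM_AFTER = """\
ffffffff81000000 T _text
ffffffff81001000 T start_kernel
"""

for name in discarded_symbols(NM_BEFORE, NM_AFTER):
    print("discarded:", name)
```

Comparing by name rather than by address matters here, because addresses
shift once sections are dropped.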

> > >
> > > Thanks,
> > >
> > > Ingo
> >
> >
> > Actually, there was LTO work some years ago
> > (but it was not pulled in).
> >
> > http://www.spinics.net/lists/linux-kbuild/msg09242.html
> >
> >
> > IIUC, this patch enables "dead code elimination",
> > (or "garbage collection"?),
> > but I think it is different from what is called LTO.
>
> Yes, it is different. With gc-sections the linker simply drops code
> sections that have no references to them.

Yes, I shouldn't have confused the terms. gc-sections is a trivial
form of LTO, but not "LTO".
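
To make the "drops sections that have no references" behaviour concrete,
here is a small conceptual model in Python. It is only a sketch of the
reachability idea, not the linker's actual implementation. The section
names and the reference graph are invented for illustration, and passing a
section as a root stands in for anchoring it with KEEP in the linker script.

```python
def live_sections(references, roots):
    """Mark every section reachable from the roots by following references."""
    live = set()
    stack = list(roots)
    while stack:
        section = stack.pop()
        if section in live:
            continue
        live.add(section)
        stack.extend(references.get(section, ()))
    return live


# Hypothetical input: each section maps to the sections it references.
REFS = {
    ".text.start": [".text.init_foo"],
    ".text.init_foo": [".data.foo"],
    ".text.unused_bar": [".data.bar"],
    # A linker table: nothing references it, but it references code.
    "__ex_table": [".text.fixup"],
}
ALL_SECTIONS = set(REFS) | {t for targets in REFS.values() for t in targets}

# Only the entry point is a root: the table and its fixup code get dropped.
without_keep = live_sections(REFS, [".text.start"])
print("discarded:", sorted(ALL_SECTIONS - without_keep))

# Treating the table as a root (modelling KEEP) retains it and its targets.
with_keep = live_sections(REFS, [".text.start", "__ex_table"])
print("discarded:", sorted(ALL_SECTIONS - with_keep))
```

In this model, a table that is only found at run time through its start and
end symbols looks unreferenced, which is why such tables need anchoring.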

> This is therefore fast and low
> complexity. LTO postpones the compiler's code optimization passes to
> the point where everything is linked together and can do things like
> constant propagation across multiple files, etc. LTO is therefore more
> efficient at removing unused code but compilation time is much longer
> due to the added complexity and inherent difficulty of parallelizing
> the operation across multiple CPUs.
>
> I think we should aim for gc-sections to be used by default and have LTO
> as a possible option only.

I agree. Once it starts getting implemented and debugged by small
system users, we could make it the default, in the interest of sharing
testing and reducing the number of build combinations.

Thanks,
Nick