Re: [PATCH v7 02/17] kbuild: add support for Clang LTO

From: Sami Tolvanen
Date: Fri Nov 20 2020 - 11:24:06 EST


On Wed, Nov 18, 2020 at 3:49 PM Nick Desaulniers
<ndesaulniers@xxxxxxxxxx> wrote:
>
> On Wed, Nov 18, 2020 at 2:07 PM Sami Tolvanen <samitolvanen@xxxxxxxxxx> wrote:
> >
> > This change adds build system support for Clang's Link Time
> > Optimization (LTO). With -flto, instead of ELF object files, Clang
> > produces LLVM bitcode, which is compiled into native code at link
> > time, allowing the final binary to be optimized globally. For more
> > details, see:
> >
> > https://llvm.org/docs/LinkTimeOptimization.html
> >
> > The Kconfig option CONFIG_LTO_CLANG is implemented as a choice,
> > which defaults to LTO being disabled. To use LTO, the architecture
> > must select ARCH_SUPPORTS_LTO_CLANG and support:
> >
> > - compiling with Clang,
> > - compiling inline assembly with Clang's integrated assembler,
> > - and linking with LLD.
> >
> > While using full LTO results in the best runtime performance, the
> > compilation is not scalable in time or memory. CONFIG_THINLTO
> > enables ThinLTO, which allows parallel optimization and faster
> > incremental builds. ThinLTO is used by default if the architecture
> > also selects ARCH_SUPPORTS_THINLTO:
> >
> > https://clang.llvm.org/docs/ThinLTO.html
> >
> > To enable LTO, LLVM tools must be used to handle bitcode files. The
> > easiest way is to pass the LLVM=1 option to make:
> >
> > $ make LLVM=1 defconfig
> > $ scripts/config -e LTO_CLANG
> > $ make LLVM=1
> >
> > Alternatively, at least the following LLVM tools must be used:
> >
> > CC=clang LD=ld.lld AR=llvm-ar NM=llvm-nm
> >
> > To prepare for LTO support with other compilers, common parts are
> > gated behind the CONFIG_LTO option, and LTO can be disabled for
> > specific files by filtering out CC_FLAGS_LTO.
> >
> > Signed-off-by: Sami Tolvanen <samitolvanen@xxxxxxxxxx>
> > Reviewed-by: Kees Cook <keescook@xxxxxxxxxxxx>
> > ---
> > Makefile | 19 +++++++-
> > arch/Kconfig | 75 +++++++++++++++++++++++++++++++
> > include/asm-generic/vmlinux.lds.h | 11 +++--
> > scripts/Makefile.build | 9 +++-
> > scripts/Makefile.modfinal | 9 +++-
> > scripts/Makefile.modpost | 21 ++++++++-
> > scripts/link-vmlinux.sh | 32 +++++++++----
> > 7 files changed, 158 insertions(+), 18 deletions(-)
> >
> > diff --git a/Makefile b/Makefile
> > index 8c8feb4245a6..240560e88d69 100644
> > --- a/Makefile
> > +++ b/Makefile
> > @@ -893,6 +893,21 @@ KBUILD_CFLAGS += $(CC_FLAGS_SCS)
> > export CC_FLAGS_SCS
> > endif
> >
> > +ifdef CONFIG_LTO_CLANG
> > +ifdef CONFIG_THINLTO
> > +CC_FLAGS_LTO += -flto=thin -fsplit-lto-unit
> > +KBUILD_LDFLAGS += --thinlto-cache-dir=$(extmod-prefix).thinlto-cache
> > +else
> > +CC_FLAGS_LTO += -flto
> > +endif
> > +CC_FLAGS_LTO += -fvisibility=default
> > +endif
> > +
> > +ifdef CONFIG_LTO
> > +KBUILD_CFLAGS += $(CC_FLAGS_LTO)
> > +export CC_FLAGS_LTO
> > +endif
> > +
> > ifdef CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_32B
> > KBUILD_CFLAGS += -falign-functions=32
> > endif
> > @@ -1473,7 +1488,7 @@ MRPROPER_FILES += include/config include/generated \
> > *.spec
> >
> > # Directories & files removed with 'make distclean'
> > -DISTCLEAN_FILES += tags TAGS cscope* GPATH GTAGS GRTAGS GSYMS
> > +DISTCLEAN_FILES += tags TAGS cscope* GPATH GTAGS GRTAGS GSYMS .thinlto-cache
> >
> > # clean - Delete most, but leave enough to build external modules
> > #
> > @@ -1719,7 +1734,7 @@ PHONY += compile_commands.json
> >
> > clean-dirs := $(KBUILD_EXTMOD)
> > clean: rm-files := $(KBUILD_EXTMOD)/Module.symvers $(KBUILD_EXTMOD)/modules.nsdeps \
> > - $(KBUILD_EXTMOD)/compile_commands.json
> > + $(KBUILD_EXTMOD)/compile_commands.json $(KBUILD_EXTMOD)/.thinlto-cache
> >
> > PHONY += help
> > help:
> > diff --git a/arch/Kconfig b/arch/Kconfig
> > index 56b6ccc0e32d..a41fcb3ca7c6 100644
> > --- a/arch/Kconfig
> > +++ b/arch/Kconfig
> > @@ -598,6 +598,81 @@ config SHADOW_CALL_STACK
> > reading and writing arbitrary memory may be able to locate them
> > and hijack control flow by modifying the stacks.
> >
> > +config LTO
> > + bool
> > +
> > +config ARCH_SUPPORTS_LTO_CLANG
> > + bool
> > + help
> > + An architecture should select this option if it supports:
> > + - compiling with Clang,
> > + - compiling inline assembly with Clang's integrated assembler,
> > + - and linking with LLD.
> > +
> > +config ARCH_SUPPORTS_THINLTO
> > + bool
> > + help
> > + An architecture should select this option if it supports Clang's
> > + ThinLTO.
> > +
> > +config THINLTO
> > + bool "Clang ThinLTO"
> > + depends on LTO_CLANG && ARCH_SUPPORTS_THINLTO
> > + default y
> > + help
> > + This option enables Clang's ThinLTO, which allows for parallel
> > + optimization and faster incremental compiles. More information
> > + can be found from Clang's documentation:
> > +
> > + https://clang.llvm.org/docs/ThinLTO.html
> > +
> > + If you say N here, the compiler will use full LTO, which may
> > + produce faster code, but building the kernel will be significantly
> > + slower as the linker won't efficiently utilize multiple threads.
> > +
> > + If unsure, say Y.
>
> I think the order of these new configs makes it so that ThinLTO
> appears above LTO in menuconfig; I don't like that, and wish it came
> immediately after. Does `THINLTO` have to be defined _after_ the
> choice for LTO_NONE/LTO_CLANG, perhaps?
>
> Secondly, I don't like how ThinLTO is a config and not a choice. If I
> don't set ThinLTO, what am I getting? That's a rhetorical question; I
> know it's full LTO, and I guess the help text does talk about the
> tradeoffs and what you would get. I guess what's curious to me is
> "why does it display ThinLTO? Why not FullLTO?" I can't help but
> wonder if a kconfig `choice` rather than a `config` would be better
> here, that way it's more obvious the user is making a choice between
> ThinLTO vs Full LTO, rather than the current patches which look like
> "ThinLTO on/off."

Changing the ThinLTO config to a choice and moving it after the main
LTO config sounds like a good idea to me. I'll see if I can change
this in v8. Thanks!
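
A rough, untested sketch of one way the choice could be laid out.
LTO_CLANG_FULL and LTO_CLANG_THIN are placeholder names that do not
appear in this patch, and the sketch assumes LTO_CLANG would turn into a
hidden symbol selected by either entry:

    # Hypothetical layout only; symbol names and prompts are assumptions.
    config LTO_CLANG
    	bool

    choice
    	prompt "Link Time Optimization (LTO)"
    	default LTO_NONE

    config LTO_NONE
    	bool "None"

    config LTO_CLANG_FULL
    	bool "Clang Full LTO"
    	depends on ARCH_SUPPORTS_LTO_CLANG
    	select LTO_CLANG

    config LTO_CLANG_THIN
    	bool "Clang ThinLTO"
    	depends on ARCH_SUPPORTS_LTO_CLANG && ARCH_SUPPORTS_THINLTO
    	select LTO_CLANG

    endchoice

With a layout like this, menuconfig would show None, Full LTO and
ThinLTO side by side, and the Makefile check would presumably test the
ThinLTO entry instead of CONFIG_THINLTO.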

Sami