Re: [PATCH v8 00/16] Add support for Clang LTO
From: Arnd Bergmann
Date: Tue Dec 08 2020 - 16:01:27 EST
On Tue, Dec 8, 2020 at 5:43 PM 'Sami Tolvanen' via Clang Built Linux
<clang-built-linux@xxxxxxxxxxxxxxxx> wrote:
>
> On Tue, Dec 8, 2020 at 4:15 AM Arnd Bergmann <arnd@xxxxxxxxxx> wrote:
> >
> > On Tue, Dec 1, 2020 at 10:37 PM 'Sami Tolvanen' via Clang Built Linux
> > <clang-built-linux@xxxxxxxxxxxxxxxx> wrote:
> > >
> > > This patch series adds support for building the kernel with Clang's
> > > Link Time Optimization (LTO). In addition to performance, the primary
> > > motivation for LTO is to allow Clang's Control-Flow Integrity (CFI)
> > > to be used in the kernel. Google has shipped millions of Pixel
> > > devices running three major kernel versions with LTO+CFI since 2018.
> > >
> > > Most of the patches are build system changes for handling LLVM
> > > bitcode, which Clang produces with LTO instead of ELF object files,
> > > postponing ELF processing until a later stage, and ensuring initcall
> > > ordering.
> > >
> > > Note that arm64 support depends on Will's memory ordering patches
> > > [1]. I will post x86_64 patches separately after we have fixed the
> > > remaining objtool warnings [2][3].
> > >
> > > [1] https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/log/?h=for-next/lto
> > > [2] https://lore.kernel.org/lkml/20201120040424.a3wctajzft4ufoiw@treble/
> > > [3] https://git.kernel.org/pub/scm/linux/kernel/git/jpoimboe/linux.git/log/?h=objtool-vmlinux
> > >
> > > You can also pull this series from
> > >
> > > https://github.com/samitolvanen/linux.git lto-v8
> >
> > I've tried pulling this into my randconfig test tree to give it a spin.
>
> Great, thank you for testing this!
>
> > So far I have
> > not managed to get a working build out of it, the main problem so far being
> > that it is really slow to build because the link stage only uses one CPU.
> > These are the other issues I've seen so far:
>
> You may want to limit your testing only to ThinLTO at first, because
> full LTO is going to be extremely slow with larger configs, especially
> when building arm64 kernels.

Ok, that seems to solve most of the remaining problems after I fixed
the module linking bug I introduced.

> > - one build seems to take even longer to link. It's currently at 35GB RAM
> > usage and 40 minutes into the final link, but I'm worried it might
> > not complete before it runs out of memory. I only have 128GB installed,
> > and google-chrome uses another 30GB of that, and I'm also doing some
> > other builds in parallel. Is there a minimum recommended amount of
> > memory for doing LTO builds?
>
> When building arm64 defconfig, the maximum memory usage I measured
> with ThinLTO was 3.5 GB, and with full LTO 20.3 GB. I haven't measured
> larger configurations, but I believe LLD can easily consume 3-4x that
> much with full LTO allyesconfig.

Ok, that's not too bad then. Is there actually a reason to still
support full LTO in your series? As I understand it, full LTO was the
initial approach and used to work better, but thin LTO is actually what
we want to use in the long run. Perhaps dropping the full LTO option
from your series now that thin LTO works well enough and uses fewer
resources would help avoid some of the problems.

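For a rough sense of scale, the 3-4x guess can be applied to the 20.3 GB
full LTO measurement quoted above. This is only a back-of-the-envelope
sketch: lto_mem_estimate is a made-up helper name, and the multipliers are
the estimate from the quoted text, not measurements.

```shell
#!/bin/sh
# Hypothetical helper (not part of the series): scale a measured peak
# memory figure in GB by the 3x and 4x factors guessed at above.
lto_mem_estimate() {
    awk -v base="$1" 'BEGIN { printf "%.1f %.1f\n", base * 3, base * 4 }'
}

# 20.3 GB is the full LTO arm64 defconfig peak quoted above.
lto_mem_estimate 20.3
```

That would put a full LTO allyesconfig link at roughly 61 to 81 GB, which
is tight on a 128GB machine where about 30GB is already taken and other
builds run in parallel.
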
> > - One build failed with
> > ld.lld -EL -maarch64elf -mllvm -import-instr-limit=5 -r -o vmlinux.o
> > -T .tmp_initcalls.lds --whole-archive arch/arm64/kernel/head.o
> > init/built-in.a usr/built-in.a arch/arm64/built-in.a kernel/built-in.a
> > certs/built-in.a mm/built-in.a fs/built-in.a ipc/built-in.a
> > security/built-in.a crypto/built-in.a block/built-in.a
> > arch/arm64/lib/built-in.a lib/built-in.a drivers/built-in.a
> > sound/built-in.a net/built-in.a virt/built-in.a --no-whole-archive
> > --start-group arch/arm64/lib/lib.a lib/lib.a
> > ./drivers/firmware/efi/libstub/lib.a --end-group
> > "ld.lld: error: arch/arm64/kernel/head.o: invalid symbol index"
> > after about 30 minutes
>
> That's interesting. Did you use LLVM_IAS=1?

I think I did, but it's possible that one of my build scripts didn't pass
that along correctly. This one seems to be gone with thin LTO.

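One way to rule out the build-script problem would be a single wrapper
that always adds the assembler setting. A minimal sketch, assuming a POSIX
shell; kbuild_cmd is a hypothetical name, and it only prints the command
line so it can be checked before anything is run:

```shell
#!/bin/sh
# Hypothetical wrapper: print the make command line for a Clang build,
# always including LLVM_IAS=1 so it cannot get lost between scripts.
kbuild_cmd() {
    arch=$1
    shift
    printf 'make ARCH=%s LLVM=1 LLVM_IAS=1' "$arch"
    # Append any remaining arguments (targets, -j, extra variables).
    for arg in "$@"; do
        printf ' %s' "$arg"
    done
    printf '\n'
}

kbuild_cmd arm64 -j8 Image
```

Printing the command instead of running it makes it easy to confirm from
the build log that LLVM_IAS=1 was actually passed.
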
> [...]
> > Not sure if these are all known issues. If there is one you'd like me to
> > take a closer look at to find which config options break it, I can try.
>
> No, none of these are known issues. I would be happy to take a closer
> look if you can share configs that reproduce these.

Attaching the config for "ld.lld: error: Never resolved function from
blockaddress (Producer: 'LLVM12.0.0' Reader: 'LLVM 12.0.0')"

      Arnd

Attachment: 0xD8CF9320_defconfig (binary data)