Re: [PATCH v8 00/16] Add support for Clang LTO

From: Fangrui Song
Date: Tue Dec 08 2020 - 23:56:40 EST



On 2020-12-08, 'Sami Tolvanen' via Clang Built Linux wrote:
On Tue, Dec 8, 2020 at 4:15 AM Arnd Bergmann <arnd@xxxxxxxxxx> wrote:

On Tue, Dec 1, 2020 at 10:37 PM 'Sami Tolvanen' via Clang Built Linux
<clang-built-linux@xxxxxxxxxxxxxxxx> wrote:
>
> This patch series adds support for building the kernel with Clang's
> Link Time Optimization (LTO). In addition to performance, the primary
> motivation for LTO is to allow Clang's Control-Flow Integrity (CFI)
> to be used in the kernel. Google has shipped millions of Pixel
> devices running three major kernel versions with LTO+CFI since 2018.
>
> Most of the patches are build system changes for handling LLVM
> bitcode, which Clang produces with LTO instead of ELF object files,
> postponing ELF processing until a later stage, and ensuring initcall
> ordering.
>
> Note that arm64 support depends on Will's memory ordering patches
> [1]. I will post x86_64 patches separately after we have fixed the
> remaining objtool warnings [2][3].
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/log/?h=for-next/lto
> [2] https://lore.kernel.org/lkml/20201120040424.a3wctajzft4ufoiw@treble/
> [3] https://git.kernel.org/pub/scm/linux/kernel/git/jpoimboe/linux.git/log/?h=objtool-vmlinux
>
> You can also pull this series from
>
> https://github.com/samitolvanen/linux.git lto-v8

I've tried pull this into my randconfig test tree to give it a spin.

Great, thank you for testing this!

So far I have
not managed to get a working build out of it, the main problem so far being
that it is really slow to build because the link stage only uses one CPU.
These are the other issues I've seen so far:

ld.lld ThinLTO uses the number of (physical cores enabled by affinity) by default.

You may want to limit your testing only to ThinLTO at first, because
full LTO is going to be extremely slow with larger configs, especially
when building arm64 kernels.

- one build seems to take even longer to link. It's currently at 35GB RAM
usage and 40 minutes into the final link, but I'm worried it might
not complete
before it runs out of memory. I only have 128GB installed, and google-chrome
uses another 30GB of that, and I'm also doing some other builds in parallel.
Is there a minimum recommended amount of memory for doing LTO builds?

When building arm64 defconfig, the maximum memory usage I measured
with ThinLTO was 3.5 GB, and with full LTO 20.3 GB. I haven't measured
larger configurations, but I believe LLD can easily consume 3-4x that
much with full LTO allyesconfig.

- One build failed with
ld.lld -EL -maarch64elf -mllvm -import-instr-limit=5 -r -o vmlinux.o
-T .tmp_initcalls.lds --whole-archive arch/arm64/kernel/head.o
init/built-in.a usr/built-in.a arch/arm64/built-in.a kernel/built-in.a
certs/built-in.a mm/built-in.a fs/built-in.a ipc/built-in.a
security/built-in.a crypto/built-in.a block/built-in.a
arch/arm64/lib/built-in.a lib/built-in.a drivers/built-in.a
sound/built-in.a net/built-in.a virt/built-in.a --no-whole-archive
--start-group arch/arm64/lib/lib.a lib/lib.a
./drivers/firmware/efi/libstub/lib.a --end-group
"ld.lld: error: arch/arm64/kernel/head.o: invalid symbol index"
after about 30 minutes

That's interesting. Did you use LLVM_IAS=1?

May be worth checking which relocation or (SHT_GROUP section's sh_info) in arch/arm64/kernel/head.o is incorrect.

- CONFIG_CPU_BIG_ENDIAN doesn't seem to work with lld, and LTO
doesn't work with ld.bfd.
I've added a CPU_LITTLE_ENDIAN dependency to
ARCH_SUPPORTS_LTO_CLANG{,THIN}

Ah, good point. I'll fix this in v9.

Full/Thin LTO should work with GNU ld and gold with LLVMgold.so built from
llvm-project (https://llvm.org/docs/GoldPlugin.html ). You'll need to make sure
that LLVMgold.so is newer than clang. (Newer clang may introduce bitcode
attributes which are unrecognizable by older LLVMgold.so/ld.lld)

[...]
Not sure if these are all known issues. If there is one you'd like me try
take a closer look at for finding which config options break it, I can try

No, none of these are known issues. I would be happy to take a closer
look if you can share configs that reproduce these.

Sami

--
You received this message because you are subscribed to the Google Groups "Clang Built Linux" group.
To unsubscribe from this group and stop receiving emails from it, send an email to clang-built-linux+unsubscribe@xxxxxxxxxxxxxxxx.
To view this discussion on the web visit https://groups.google.com/d/msgid/clang-built-linux/CABCJKueCHo2RYfx_A21m%2B%3Dd1gQLR9QsOOxCsHFeicCqyHkb-Kg%40mail.gmail.com.