Re: [PATCH v8 00/16] Add support for Clang LTO

From: Nathan Chancellor
Date: Thu Dec 03 2020 - 13:22:46 EST


On Thu, Dec 03, 2020 at 09:07:30AM -0800, Sami Tolvanen wrote:
> On Thu, Dec 3, 2020 at 3:26 AM Will Deacon <will@xxxxxxxxxx> wrote:
> >
> > Hi Sami,
> >
> > On Tue, Dec 01, 2020 at 01:36:51PM -0800, Sami Tolvanen wrote:
> > > This patch series adds support for building the kernel with Clang's
> > > Link Time Optimization (LTO). In addition to performance, the primary
> > > motivation for LTO is to allow Clang's Control-Flow Integrity (CFI)
> > > to be used in the kernel. Google has shipped millions of Pixel
> > > devices running three major kernel versions with LTO+CFI since 2018.
> > >
> > > Most of the patches are build system changes for handling LLVM
> > > bitcode, which Clang produces with LTO instead of ELF object files,
> > > postponing ELF processing until a later stage, and ensuring initcall
> > > ordering.
> > >
> > > Note that arm64 support depends on Will's memory ordering patches
> > > [1]. I will post x86_64 patches separately after we have fixed the
> > > remaining objtool warnings [2][3].
> >
> > I took this series for a spin, with my for-next/lto branch merged in but
> > I see a failure during the LTO stage with clang 11.0.5 because it doesn't
> > understand the '.arch_extension rcpc' directive we throw out in READ_ONCE().
>
> I just tested this with Clang 11.0.0, which I believe is the latest
> 11.x version, and the current Clang 12 development branch, and both
> work for me. Godbolt confirms that '.arch_extension rcpc' is supported
> by the integrated assembler starting with Clang 11 (the example fails
> with 10.0.1):
>
> https://godbolt.org/z/1csGcT
>
> What does running clang --version and ld.lld --version tell you?

11.0.5 is AOSP's clang, which is behind the upstream 11.0.0 release, so
it is most likely missing the patch that added rcpc support. I think a
version based on the development branch (12.0.0) is in the works, but I
am not sure.

> > We actually check that this extension is available before using it in
> > the arm64 Kconfig:
> >
> > config AS_HAS_LDAPR
> > 	def_bool $(as-instr,.arch_extension rcpc)
> >
> > so this shouldn't happen. I then realised I wasn't passing LLVM_IAS=1
> > on my Make command line; with that, the detection works correctly
> > and the LTO step succeeds.
> >
> > Why is it necessary to pass LLVM_IAS=1 if LTO is enabled? I think it
> > would be _much_ better if this was implicit (or if LTO depended on it).
>
> Without LLVM_IAS=1, Clang uses two different assemblers when LTO is
> enabled: the external GNU assembler for stand-alone assembly, and
> LLVM's integrated assembler for inline assembly. as-instr tests the
> external assembler and makes an admittedly reasonable assumption that
> the test is also valid for inline assembly.
>
> I agree that it would reduce confusion in future if we just always
> enabled IAS with LTO. Nick, Nathan, any thoughts about this?

I am personally fine with that. As far as I am aware, we are in a fairly
good spot on arm64 and x86_64 when it comes to the integrated assembler.
Should we require the user to pass LLVM_IAS=1 explicitly when enabling
LTO, or should we only add the no-integrated-assembler flag
(-no-integrated-as) to CLANG_FLAGS when LTO is not enabled? The latter
will require extra handling because, I think, Kconfig is not available
at that stage.
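
To make the "user passes LLVM_IAS=1 explicitly" option concrete, here is
a rough, untested sketch of how the requirement could be expressed in
Kconfig. The symbol name HAS_LTO_CLANG_IAS is made up for illustration,
and the sketch assumes that LLVM_IAS from the make command line is
visible to Kconfig's $(...) expansion through the environment:

```
# Hypothetical symbol, not part of the posted series.
config HAS_LTO_CLANG_IAS
	def_bool y
	depends on CC_IS_CLANG && LD_IS_LLD
	# Assumption: LLVM_IAS reaches Kconfig via the environment. If it
	# is unset or not 1, 'test' exits non-zero, $(success,...) expands
	# to n, and the symbol stays disabled.
	depends on $(success,test $(LLVM_IAS) -eq 1)
```

An LTO option that depends on such a symbol would simply not be
selectable without LLVM_IAS=1, so as-instr and inline assembly would
always be checked against the same assembler.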

Cheers,
Nathan