Re: [PATCH] kbuild: support 'LLVM' to switch the default tools to Clang/LLVM

From: Masahiro Yamada
Date: Sun Apr 05 2020 - 12:46:16 EST


On Sat, Apr 4, 2020 at 3:24 AM Nick Desaulniers <ndesaulniers@xxxxxxxxxx> wrote:
>
> On Thu, Apr 2, 2020 at 10:17 PM Masahiro Yamada <masahiroy@xxxxxxxxxx> wrote:
> >
> > As Documentation/kbuild/llvm.rst implies, building the kernel with a
> > full set of LLVM tools gets very verbose and unwieldy.
> >
> > Provide a single switch 'LLVM' to use Clang and LLVM tools instead of
> > GCC and Binutils. You can pass LLVM=1 from the command line or as an
> > environment variable. Then, Kbuild will use LLVM toolchains in your
> > PATH environment.
> >
> > Please note LLVM=1 does not turn on the LLVM integrated assembler.
> > You need to explicitly pass AS=clang to use it. When the upstream
> > kernel is ready for the integrated assembler, I think we can make
> > it default.
>
> Having this behavior change over time may be surprising. I'd rather
> that if you want to not use the integrated assembler, you explicitly
> negate it, or just don't use the LLVM=1 syntax, ie. `make CC=clang
> LD=ld.lld ...`.
>
> We could modify how `-no-integrated-as` is chosen when LLVM=1.
>
> make LLVM=1 LLVMIA=0 ... # add `-no-integrated-as`
> # what the flag is doesn't really matter to me, something shorter might be nice.
> make LLVM=1 # use all LLVM tools
>
> Since we got rid of $(AS), it would be appropriate to remove/change it
> there, since no one really relies on AS=clang right now. (We do have 1
> of our 60+ CI targets using it, but we can also change that trivially.
> So I think we have a lot of freedom to change how `-no-integrated-as`
> is set.
>
> This could even be independent of this patch.


I also thought a boolean flag is preferred.

AS=clang will not live long anyway, and
I hesitated to break the compatibility
for the short-term workaround.

But, if this is not a big deal, I can
replace AS=clang with LLVMIA=1.



> >
> > We discussed what we need, and we agreed to go with a simple boolean
> > switch (https://lkml.org/lkml/2020/3/28/494).
> >
> > Some items in the discussion:
> >
> > - LLVM_DIR
> >
> > When multiple versions of LLVM are installed, I just thought supporting
> > LLVM_DIR=/path/to/my/llvm/bin/ might be useful.
> >
> > CC = $(LLVM_DIR)clang
> > LD = $(LLVM_DIR)ld.lld
> > ...
> >
> > However, we can handle this by modifying PATH. So, we decided to not do
> > this.
> >
> > - LLVM_SUFFIX
> >
> > Some distributions (e.g. Debian) package specific versions of LLVM with
> > naming conventions that use the version as a suffix.
> >
> > CC = clang$(LLVM_SUFFIX)
> > LD = ld.lld(LLVM_SUFFIX)
> > ...
> >
> > will allow a user to pass LLVM_SUFFIX=-11 to use clang-11 etc.,
> > but the suffixed versions in /usr/bin/ are symlinks to binaries in
> > /usr/lib/llvm-#/bin/, so this can also be handled by PATH.
> >
> > - HOSTCC, HOSTCXX, etc.
> >
> > We can switch the host compilers in the same way:
> >
> > ifneq ($(LLVM),)
> > HOSTCC = clang
> > HOSTCXX = clang++
> > else
> > HOSTCC = gcc
> > HOSTCXX = g++
> > endif
> >
> > This may the right thing to do, but I could not make up my mind.
> > Because we do not frequently switch the host compiler, a counter
> > solution I had in my mind was to leave it to the default of the
> > system.
> >
> > HOSTCC = cc
> > HOSTCXX = c++
> >
> > Many distributions support update-alternatives to switch the default
> > to GCC, Clang, or whatever, but reviewers were opposed to this
> > approach. So, this commit does not touch the host tools.
>
> update-alternatives assumes you've installed Clang via a package manager?
> $ update-alternatives --list cc
> /usr/bin/gcc
> On my system even though clang and friends are in my PATH.
>
> And previously, there was feedback that maybe folks don't want to
> change `cc` on their systems just for Clang kernel builds.
> https://lkml.org/lkml/2020/3/30/836
> https://lkml.org/lkml/2020/3/30/838
>
> A goal for ClangBuiltLinux is to build a kernel image with no GCC or
> binutils installed on the host. Let the record reflect that. And
> there's been multiple complaints that the existing syntax is too long
> for specifying all of the tools.
>
> LLVM=1 is meant to be one flag. Not `make LLVM=1 HOSTCC=clang
> HOSTCXX=clang`. If folks want fine grain flexibility, use the
> existing command line interface, which this patch does not change.
> LLVM=1 is opinionated, and inflexible, because it makes a strong
> choice to enable LLVM for everything.
>
> Another reason why I don't want to change these over time, and why I
> want them all to be in sync is that there are 4 different CI systems
> for the kernel, and they are currently fragmented in terms of who is
> using what tools:
>
> KernelCI: CC=clang only
> Kbuild test robot aka 0day bot: CC=clang LD=ld.lld
> Linaro TCWG: CC=clang only
> our CI: a complete mix due to combinatorial explosion, but more
> coverage of LLVM than everyone else.
>
> That is a mess that we must solve. Having 1 flag that works
> consistently across systems is one solution. Now if those were all
> using LLVM=1, but some were enabling Clang's integrated assembler, and
> some weren't because we changed the default over time, then we'd be
> right back to this mismatch between systems. I'd much rather draw the
> line in the sand, and say "this is how this flag will work, since day
> 1." Maybe it's too rigid, but it's important to me that if we create
> something new to solve multiple objectives (1. simplifies existing
> interface. 2. turns on everything.) that it does so. It is a partial
> solution, if it eliminates some of the flags while leaving others. I
> want a full solution.
>
> If folks want the flexibility to mix and match tools, the existing
> interface is capable. But for us to track who is using what, we need
> 1 flag that we know is not different depending on the cc of the
> system. Once clang's integrated assembler is good to go, we will
> begin recommending LLVM=1 to everyone. And we want feedback if we
> regress building the host utilities during a kernel build, even if
> there are not many.
>
> I'm on the fence about having all of the above satisfied by one patch,
> or taking this patch as is and following up on the above two points
> (related to disabling `-no-integrated-as` and setting HOSTCC). I
> trust your judgement and respect your decisions, so I'll defer to you
> Masahiro, but I need to make explicit the design goals. Maybe with
> this additional context it can help inform the design.
> Tested-by: Nick Desaulniers <ndesaulniers@xxxxxxxxxx>


Thanks for the comments.

I'd rather want to do this incrementally,
making sure I am doing right.


The meaning of LLVM=1 may change over time.
It means "the recommended settings" at the moment.

If CI does not want to change the behavior across
kernel versions, it can pass individual variables
explicitly.

--
Best Regards
Masahiro Yamada