Re: [PATCH] kbuild: support 'LLVM' to switch the default tools to Clang/LLVM

From: Fangrui Song
Date: Sun Apr 05 2020 - 19:55:14 EST


On 2020-04-06, Masahiro Yamada wrote:
On Sat, Apr 4, 2020 at 3:24 AM Nick Desaulniers <ndesaulniers@xxxxxxxxxx> wrote:

On Thu, Apr 2, 2020 at 10:17 PM Masahiro Yamada <masahiroy@xxxxxxxxxx> wrote:
>
> As Documentation/kbuild/llvm.rst implies, building the kernel with a
> full set of LLVM tools gets very verbose and unwieldy.
>
> Provide a single switch 'LLVM' to use Clang and LLVM tools instead of
> GCC and Binutils. You can pass LLVM=1 from the command line or as an
> environment variable. Then, Kbuild will use LLVM toolchains in your
> PATH environment.
>
> Please note LLVM=1 does not turn on the LLVM integrated assembler.
> You need to explicitly pass AS=clang to use it. When the upstream
> kernel is ready for the integrated assembler, I think we can make
> it default.

Having this behavior change over time may be surprising. I'd rather
that if you want to not use the integrated assembler, you explicitly
negate it, or just don't use the LLVM=1 syntax, ie. `make CC=clang
LD=ld.lld ...`.

We could modify how `-no-integrated-as` is chosen when LLVM=1.

make LLVM=1 LLVMIA=0 ... # add `-no-integrated-as`
# what the flag is doesn't really matter to me, something shorter might be nice.
make LLVM=1 # use all LLVM tools

Since we got rid of $(AS), it would be appropriate to remove/change it
there, since no one really relies on AS=clang right now. (We do have 1
of our 60+ CI targets using it, but we can also change that trivially.
So I think we have a lot of freedom to change how `-no-integrated-as`
is set.

This could even be independent of this patch.


I also thought a boolean flag is preferred.

AS=clang will not live long anyway, and
I hesitated to break the compatibility
for the short-term workaround.

But, if this is not a big deal, I can
replace AS=clang with LLVMIA=1.

My mere complaint is that it may be difficult to infer the intention (integrated
assembler) from the abbreviation "IA" in "LLVMIA" :/

Something with "AS" in the name may be easier for a user to understand,
e.g. CLANG_AS or LLVM_AS.

>
> We discussed what we need, and we agreed to go with a simple boolean
> switch (https://lkml.org/lkml/2020/3/28/494).
>
> Some items in the discussion:
>
> - LLVM_DIR
>
> When multiple versions of LLVM are installed, I just thought supporting
> LLVM_DIR=/path/to/my/llvm/bin/ might be useful.
>
> CC = $(LLVM_DIR)clang
> LD = $(LLVM_DIR)ld.lld
> ...
>
> However, we can handle this by modifying PATH. So, we decided to not do
> this.
>
> - LLVM_SUFFIX
>
> Some distributions (e.g. Debian) package specific versions of LLVM with
> naming conventions that use the version as a suffix.
>
> CC = clang$(LLVM_SUFFIX)
> LD = ld.lld(LLVM_SUFFIX)
> ...
>
> will allow a user to pass LLVM_SUFFIX=-11 to use clang-11 etc.,
> but the suffixed versions in /usr/bin/ are symlinks to binaries in
> /usr/lib/llvm-#/bin/, so this can also be handled by PATH.
>
> - HOSTCC, HOSTCXX, etc.
>
> We can switch the host compilers in the same way:
>
> ifneq ($(LLVM),)
> HOSTCC = clang
> HOSTCXX = clang++
> else
> HOSTCC = gcc
> HOSTCXX = g++
> endif
>
> This may the right thing to do, but I could not make up my mind.
> Because we do not frequently switch the host compiler, a counter
> solution I had in my mind was to leave it to the default of the
> system.
>
> HOSTCC = cc
> HOSTCXX = c++
>
> Many distributions support update-alternatives to switch the default
> to GCC, Clang, or whatever, but reviewers were opposed to this
> approach. So, this commit does not touch the host tools.

update-alternatives assumes you've installed Clang via a package manager?
$ update-alternatives --list cc
/usr/bin/gcc
On my system even though clang and friends are in my PATH.

And previously, there was feedback that maybe folks don't want to
change `cc` on their systems just for Clang kernel builds.
https://lkml.org/lkml/2020/3/30/836
https://lkml.org/lkml/2020/3/30/838

A goal for ClangBuiltLinux is to build a kernel image with no GCC or
binutils installed on the host. Let the record reflect that. And
there's been multiple complaints that the existing syntax is too long
for specifying all of the tools.

LLVM=1 is meant to be one flag. Not `make LLVM=1 HOSTCC=clang
HOSTCXX=clang`. If folks want fine grain flexibility, use the
existing command line interface, which this patch does not change.
LLVM=1 is opinionated, and inflexible, because it makes a strong
choice to enable LLVM for everything.

Another reason why I don't want to change these over time, and why I
want them all to be in sync is that there are 4 different CI systems
for the kernel, and they are currently fragmented in terms of who is
using what tools:

KernelCI: CC=clang only
Kbuild test robot aka 0day bot: CC=clang LD=ld.lld
Linaro TCWG: CC=clang only
our CI: a complete mix due to combinatorial explosion, but more
coverage of LLVM than everyone else.

That is a mess that we must solve. Having 1 flag that works
consistently across systems is one solution. Now if those were all
using LLVM=1, but some were enabling Clang's integrated assembler, and
some weren't because we changed the default over time, then we'd be
right back to this mismatch between systems. I'd much rather draw the
line in the sand, and say "this is how this flag will work, since day
1." Maybe it's too rigid, but it's important to me that if we create
something new to solve multiple objectives (1. simplifies existing
interface. 2. turns on everything.) that it does so. It is a partial
solution, if it eliminates some of the flags while leaving others. I
want a full solution.

If folks want the flexibility to mix and match tools, the existing
interface is capable. But for us to track who is using what, we need
1 flag that we know is not different depending on the cc of the
system. Once clang's integrated assembler is good to go, we will
begin recommending LLVM=1 to everyone. And we want feedback if we
regress building the host utilities during a kernel build, even if
there are not many.

I'm on the fence about having all of the above satisfied by one patch,
or taking this patch as is and following up on the above two points
(related to disabling `-no-integrated-as` and setting HOSTCC). I
trust your judgement and respect your decisions, so I'll defer to you
Masahiro, but I need to make explicit the design goals. Maybe with
this additional context it can help inform the design.
Tested-by: Nick Desaulniers <ndesaulniers@xxxxxxxxxx>


Thanks for the comments.

I'd rather want to do this incrementally,
making sure I am doing right.


The meaning of LLVM=1 may change over time.
It means "the recommended settings" at the moment.

If CI does not want to change the behavior across
kernel versions, it can pass individual variables
explicitly.

--
Best Regards
Masahiro Yamada

--
You received this message because you are subscribed to the Google Groups "Clang Built Linux" group.
To unsubscribe from this group and stop receiving emails from it, send an email to clang-built-linux+unsubscribe@xxxxxxxxxxxxxxxxx
To view this discussion on the web visit https://groups.google.com/d/msgid/clang-built-linux/CAK7LNAQybfcYiosNU%2Bybd-Q7-Y2dbLqBVN2XA00wCRnFAoqdew%40mail.gmail.com.