Re: mainline build failure due to 281d0c962752 ("fortify: Add Clang support")

From: Nick Desaulniers
Date: Wed Jun 22 2022 - 18:40:38 EST

Next message: kernel test robot: "[ammarfaizi2-block:kvalo/ath/ath12k-bringup-extras 2/3] drivers/net/wireless/ath/ath12k/pci.c:678:7: error: call to undeclared function 'of_property_read_u32'; ISO C99 and later do not support implicit function declarations"
Previous message: Ira Weiny: "Re: [PATCH V11 3/8] PCI: Create PCI library functions in support of DOE mailboxes."
In reply to: Linus Torvalds: "Re: mainline build failure due to 281d0c962752 ("fortify: Add Clang support")"
Next in thread: David Laight: "RE: mainline build failure due to 281d0c962752 ("fortify: Add Clang support")"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

+ llvm list, moving net and net-folk to bcc. Follow along on lore if
still interested.

On Wed, Jun 22, 2022 at 10:49 AM Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Wed, Jun 22, 2022 at 12:26 PM Sudip Mukherjee
> <sudipm.mukherjee@xxxxxxxxx> wrote:
> >
> > Tried it after applying your patch. There was no build failure, but some warnings:
>
> So some of those objtool warnings are, I think, because clang does odd
> and crazy things for when it decides "this is not reachable" code.
>
> I don't much like it, and neither does objtool, but it is what it is.
> When clang decides "I'm calling a function that cannot return", it
> will have a "call" instruction and then it will just fall off the face
> of the earth after that.
>
> That includes falling through to the next function, or just to random
> other labels after the function, and then objtool as a result
> complains about a stack state mismatch (when the fallthrough is the
> same function, but now the stack pointer is different in different
> parts), or of the "falls through to next function".
>
> I think it's a clang misfeature in that if something goes wrong, you
> basically execute random code. I'd much rather see clang insert a real
> 'ud' instruction or 'int3' or whatever. But it doesn't.

So adding `-mllvm -trap-unreachable` will turn these
`__builtin_unreachable()`'s into trapping instructions. I think we
should just do that/enable that in the kernel. The following patch
eliminates ALL of the fallthrough warnings observed from objtool on
x86_64 defconfig builds.

```
diff --git a/scripts/Makefile.clang b/scripts/Makefile.clang
index 87285b76adb2..1fbf8a8f3751 100644
--- a/scripts/Makefile.clang
+++ b/scripts/Makefile.clang
@@ -36,6 +36,7 @@ endif
 # so they can be implemented or wrapped in cc-option.
 CLANG_FLAGS += -Werror=unknown-warning-option
 CLANG_FLAGS += -Werror=ignored-optimization-argument
+CLANG_FLAGS += -mllvm -trap-unreachable
 KBUILD_CFLAGS += $(CLANG_FLAGS)
 KBUILD_AFLAGS += $(CLANG_FLAGS)
 export CLANG_FLAGS
```
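
For anyone following along, here is a minimal sketch of the two
patterns in question: a call to a function that cannot return, and an
explicit `__builtin_unreachable()`. The names `fatal()`,
`checked_div()` and `apply()` are made up for illustration; this is
not kernel code.

```c
#include <stdlib.h>

enum op { OP_ADD, OP_SUB };

/* Hypothetical stand-in for a function that never returns. */
__attribute__((noreturn)) void fatal(void)
{
	abort();
}

int checked_div(int a, int b)
{
	if (b == 0)
		fatal();	/* the compiler may emit nothing after this call */
	return a / b;
}

int apply(enum op op, int a, int b)
{
	switch (op) {
	case OP_ADD:
		return a + b;
	case OP_SUB:
		return a - b;
	}
	/* Tells the compiler this point is never reached. */
	__builtin_unreachable();
}
```

Compiling this with `clang -O2 -S`, once with and once without
`-mllvm -trap-unreachable`, and diffing the assembly should show
whether a trapping instruction follows the call to `fatal()` and the
end of `apply()`. The exact instruction depends on the target.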

There's more I need to do for LTO; `-mllvm` flags need to be passed to
the linker in that case. Let me do a few more builds and collect
statistics on build size differences (guessing negligible), then
I'll send out a more formal patch.

>
> I didn't check whether gcc avoids that "don't make assumptions about
> non-return functions" or whether it's just that objtool recognizes
> whatever pattern gcc uses.
>
> So *some* of the warnings are basically code generation disagreements,
> but aren't signs of actual problems per se.
>
> Others may be because objtool knows about gcc foibles in ways it
> doesn't know about some clang foibles. I think the "call to memcpy()
> leaves .noinstr.text section" might be of that type: clang seems to
> sometimes generate out-of-line memcpy calls for absolutely ridiculous
> things (I've seen a 16-byte memcpy done that way - just stupid when
> it's just two load/store pairs, and just the function call overhead is
> much more than that).

IIRC, that was from CONFIG_KASAN.

Looking at the disassembly (llvm-objdump's `--disassemble-symbols=`
flag is handy) of the following from an allmodconfig build:

vmlinux.o: warning: objtool: sync_regs+0x24: call to memcpy() leaves
.noinstr.text section

168B memcpy

vmlinux.o: warning: objtool: vc_switch_off_ist+0xbe: call to memcpy()
leaves .noinstr.text section

168B memcpy

vmlinux.o: warning: objtool: fixup_bad_iret+0x36: call to memset()
leaves .noinstr.text section

168B memset

vmlinux.o: warning: objtool: __sev_get_ghcb+0xa0: call to memcpy()
leaves .noinstr.text section

4096B memcpy

vmlinux.o: warning: objtool: __sev_put_ghcb+0x35: call to memcpy()
leaves .noinstr.text section

4096B memcpy

So it doesn't seem like it's the same issue of "dumb memcpy of small
`n`" that we'd seen previously. I suspect that objtool's assumption
that the compiler can't turn assignments into libcalls is...compiler
specific.
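
To illustrate what "turning an assignment into a libcall" means, here
is a minimal sketch. `struct regs_like` and `copy_regs()` are
hypothetical names; the structure is only sized to match the 168B
copies above, and I am assuming (not asserting) that the flagged
functions contain a plain struct assignment like this one.

```c
#include <string.h>

/* Hypothetical structure: 21 * 8 = 168 bytes on an LP64 target. */
struct regs_like {
	unsigned long r[21];
};

void copy_regs(struct regs_like *dst, const struct regs_like *src)
{
	/*
	 * No memcpy() appears in the source, but the compiler is free to
	 * lower this assignment to an out-of-line memcpy() call.
	 */
	*dst = *src;
}
```

Whether a given compiler inlines the copy or emits `call memcpy` here
depends on the version, the optimization level, and instrumentation
such as KASAN, which is why the objtool warning can show up with one
toolchain and not another.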

Replying to earlier points in the thread now:

On Wed, Jun 22, 2022 at 9:21 AM Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Wed, Jun 22, 2022 at 10:02 AM Nathan Chancellor <nathan@xxxxxxxxxx> wrote:
> >
> > Right, we are working on a statically linked and optimized build of LLVM
> > that people can use similar to the GCC builds provided on kernel.org,
> > which should make the compile time problem not as bad as well as making
> > it easier for developers to get access to a recent version of clang with
> > all the fixes and improvements that we have made in upstream LLVM.
>
> So I'm on the road, and will try to see how well I can do that
> allmodconfig build on my poor laptop and see what else goes wrong for
> now.
>
> But I do have to say that it's just a lot better if the regular distro
> compiler build works out of the box. I did build my own clang binary
> for a while, just because I was an early adopter of the whole "asm
> goto with outputs" thing, but I've been *so* much happier now that I
> don't need to do that any more.
>
> So I would prefer not going backwards.

I agree.

> I wish the standard clang build just stopped doing the crazy shared
> library thing. The security arguments for it are just ridiculous, when
> any shared library update for any security reason would mean a full
> clang update _anyway_.

Regarding the "security" argument for the use of shared libraries,
I agree with you. If the compiler will just crash when given five
open parentheses as input, it was never designed for untrusted input
in the first place.

That said, if I had to host executables and libraries for download,
perhaps it would be a smaller bill to serve a bunch of libraries over
fully statically linked (and thus larger) executables.

I think it's too late and a non-starter to suggest removing the
ability to build libllvm.so or libclang.so to the LLVM community at
this point though.

>
> I realize it's partly distro rules too, but the distros only do that
> "we always build shared libraries" when the project itself makes that
> an option. And it's a really stupid option for the regular C compiler
> case.

Right, in that case our hands are somewhat tied. They're not our
distributions. Even if we do our own builds/distributions of clang,
you can lead a horse to water...

That said, I think we can help distros better configure their
builds. For instance, a sweet spot might be to statically link clang,
but dynamically link all of the GNU-binutils-like substitutes. Those
are seldom invoked and not the bottleneck in any profile. LLVM's
cmake doesn't have an option to do that easily though; we should
provide one then recommend distros use it. Make it easy to do the
right thing.

>
> Side note: I think gcc takes almost exactly the opposite approach, and
> does much better as a result. It doesn't do a shared library, but what
> it *does* do is make 'gcc' itself a reasonably small binary that just
> does the command line front-end parsing.
>
> The advantage of the gcc model is that it works *really* well for the
> "test compiler options" kind of stage, where you only run that much
> simpler 'gcc' wrapper binary.
>
> We don't actually take full advantage of that, because we do end up
> doing a real "build" of an empty file, so "cc1" does actually get
> executed, but even then it's done fairly efficiently with 'vfork()'.
> That "cc-option" part of the kernel build is actually noticeable
> during configuration etc, and clang is *much* slower because of how it
> is built.
>
> See
>
> time clang -Werror -c -x c /dev/null
>
> and compare it with gcc. And if you want to see a really *big*
> difference, pick a command line option that causes an error because it
> doesn't exist..

Looking at a profile, there's a lot of stupid stuff we're doing. We
can probably get faster "at doing nothing." See
https://gist.github.com/nickdesaulniers/81a87ffa784c13d0bf60f60b1d54651b
for the profile and my commentary/initial thoughts.

>
> I really wish clang wasn't so much noticeably slower. It's limiting
> what I do with it, and I've had other developers say the same.

We can do better. I'll keep pushing on this up my chain of command.
That statement stands in stark contrast to the below:

On Wed, Jun 22, 2022 at 6:47 AM Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> I build the kernel I actually _use_ with clang, and make sure it's
> clean in sane configurations, but my full allmodconfig build I do with
> gcc.

A fantastic and motivational endorsement for the hard work we put in,
which is why it would be a travesty if build times and allmodconfig
hygiene caused us to lose your support.

>
> Partly because of that "the clang build hasn't quite gotten there yet"
> and partly because last I tried it was even slower to build (not a big
> issue for my default config, but does matter for the allmodconfig
> build, even on my beefy home machine)
>
> I would love for people to start doing allmodconfig builds with clang
> too, but it would require some initial work to fix it... Hint, hint.

As Nathan notes, we've been working on it. Long tail that's constantly
regressing, but WIP nonetheless.

>
> And in the case of this warning attribute case, the clang error messages are
>
> (a) verbose
>
> (b) useless
>
> because they point to where the warning attribute is (I know where it
> is), but don't point to where it's actually triggering (ie where it
> was actually inlined and called from).
>
> The gcc equivalent of that warning actually says exactly where the
> problem is. The clang one is useless, which is probably part of why
> people aren't fixing them, because even if they would want to, they
> just give up.
>
> Nick, Nathan, any chance of getting better error messages out of
> clang? In some cases, they are very good, so it's not like clang does
> bad error messages by default. But in this case, the error message
> really is *entirely* useless.

Yeah, it's definitely not helpful in its current form. I'll have to
think a bit more about how we can retain and display inlining
decisions, which is what's necessary here to make the diagnostic
actionable.

Building with `KCFLAGS=-Rpass=inline` does provide some hints, but
also a lot of unhelpful noise:

```
$ make LLVM=1 -j72 drivers/net/ethernet/huawei/hinic/hinic_devlink.o \
    KCFLAGS=-Rpass=inline
...
drivers/net/ethernet/huawei/hinic/hinic_devlink.c:46:3: remark: '_Z18fortify_memcpy_chkmmmmmPKc' inlined into 'check_image_valid': always inline attribute at callsite check_image_valid:23:3; [-Rpass=inline]
memcpy(&host_image->image_section_info[i],
^
...
```
AFAIK, the current architecture of LLVM doesn't retain the inlining
decisions it makes, so clang can point to the definition of a function
that shouldn't have been called (one annotated w/
__attribute__((error(""))) or __attribute__((warning("")))) but it
can't tell you which call site specifically was problematic. There are
similarly unhelpful diagnostics sometimes with inline asm, which feel
vaguely reminiscent of this and which I document here:
https://github.com/ClangBuiltLinux/linux/issues/1571#issuecomment-1135199630.
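
For context, here is a minimal sketch of how an error-attributed
function combines with an always-inline wrapper. The names
`copy_overflow_detected()`, `copy_checked()` and `copy_checked_demo()`
are hypothetical; this is a simplified stand-in, not the kernel's
fortify_memcpy_chk().

```c
#include <stddef.h>
#include <string.h>

/*
 * Never defined. If a call to it survives optimization, the compiler
 * reports the message below as a compile-time error.
 */
void copy_overflow_detected(void)
	__attribute__((error("copy_checked: write exceeds destination size")));

/* Hypothetical wrapper, inlined into every caller. */
static inline __attribute__((always_inline))
void *copy_checked(void *dst, size_t dst_size, const void *src, size_t n)
{
	if (__builtin_constant_p(n) && __builtin_constant_p(dst_size) &&
	    n > dst_size)
		copy_overflow_detected();
	return memcpy(dst, src, n);
}

/* In-bounds use: the check folds away and nothing is diagnosed. */
int copy_checked_demo(void)
{
	char src[8] = "fortify";
	char dst[8];

	copy_checked(dst, sizeof(dst), src, sizeof(src));
	return strcmp(dst, "fortify") == 0;
}
```

The offending call to `copy_overflow_detected()` only exists after
`copy_checked()` has been inlined into some caller, so a diagnostic
that points at the wrapper or at the attribute does not identify the
caller that passed the bad size.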

As to _why_ clang isn't getting this object size correct, I wasn't
able to find out today, but will keep digging. Stay tuned.
https://github.com/ClangBuiltLinux/linux/issues/1592
--
Thanks,
~Nick Desaulniers
