Re: [PATCH v2] x86/pgtable: Always inline p4d_index

From: Thomas Gleixner
Date: Wed Jan 16 2019 - 16:54:45 EST


On Wed, 16 Jan 2019, Nathan Chancellor wrote:

> When building an allyesconfig build with Clang, the kernel fails to link
> arch/x86/platform/efi/efi_64.o because of a failed BUILD_BUG_ON:
>
> ld: arch/x86/platform/efi/efi_64.o: in function `efi_sync_low_kernel_mappings':
> (.text+0x8e5): undefined reference to `__compiletime_assert_277'
>
> Since there are several BUILD_BUG_ONs in efi_64.c, I isolated it down to
> the following:
>
> BUILD_BUG_ON(p4d_index(EFI_VA_END) != p4d_index(MODULES_END));
>
> After some research, it turns out that there is a new config option
> called NO_AUTO_INLINE, which adds '-fno-inline-functions' to
> KBUILD_CFLAGS so that the compiler does not auto inline small functions,
> which causes this BUILD_BUG_ON to fail because p4d_index is no longer an
> integer constant expression.

There is no tree where this config option exists.

> According to the help text of the config, functions explicitly marked
> inline should still be inlined. As it turns out, GCC and Clang both
> support '-fno-inline-functions' but Clang only inlines functions when
> they are marked with an always inline annotation[1].
>
> Since it's expected that p4d_index should always be inlined so that its
> value can be evaluated at build time, mark it as __always_inline.

Sorry no, that's just papering over the problem.

The point is that NO_AUTO_INLINE is/was meant to prevent the compiler from
automatically inlining functions, which are NOT marked inline in any form
in order to expand the traceability.

With GCC this makes sense, because it still inlines functions which are
solely marked inline and even if it decides not to inline them it will
evaluate them if they expand to a build time constant expression.

Now Clang decided to give -fno-inline-functions a different meaning,
i.e. the same as GCC has for -fno-inline, which prevents inlining for
everything except functions which are marked __always_inline.

So just "fixing" the build wreckage by duct taping one single instance of
inline functions is really a bad idea. The resulting kernel will be
bloatware because all regular inlines and we have tons of them will turn
into function calls even if the function overhead is larger than the
resulting inline code. Not to talk about the performance impact.

Anyway, as this option is found to be nowhere there is no point to apply
this patch.

Thanks,

tglx