Re: [PATCH 10/13] kbuild: build modules in the same way with/without Clang LTO

From: Masahiro Yamada
Date: Thu Aug 19 2021 - 04:12:46 EST


On Thu, Aug 19, 2021 at 3:59 PM Kees Cook <keescook@xxxxxxxxxxxx> wrote:
>
> On Thu, Aug 19, 2021 at 09:57:41AM +0900, Masahiro Yamada wrote:
> > When Clang LTO is enabled, additional intermediate files *.lto.o are
> > created because LLVM bitcode must be converted to ELF before modpost.
> >
> > For non-LTO builds:
> >
> >           $(LD)            $(LD)
> >   objects ---> <modname>.o -----> <modname>.ko
> >                              |
> >          <modname>.mod.o ---/
> >
> > For Clang LTO builds:
> >
> >           $(AR)            $(LD)                $(LD)
> >   objects ---> <modname>.o ---> <modname>.lto.o -----> <modname>.ko
> >                                                   |
> >                                <modname>.mod.o --/
> >
> > Since the Clang LTO introduction, ugly CONFIG_LTO_CLANG conditionals
> > are sprinkled everywhere in the kbuild code.
> >
> > Another confusion for Clang LTO builds is, <modname>.o is an archive
> > that contains LLVM bitcode files. The suffix should have been .a
> > instead of .o
> >
> > To clean up the code, unify the build process of modules, as follows:
> >
> >           $(AR)            $(LD)                    $(LD)
> >   objects ---> <modname>.a ---> <modname>.prelink.o -----> <modname>.ko
> >                                                       |
> >                                <modname>.mod.o ------/
> >
> > Here, 'objects' are either ELF or LLVM bitcode. <modname>.a is an archive,
> > <modname>.prelink.o is ELF.
>
> I like this design, but I do see that it has a small but measurable
> impact on build times:
>
> allmodconfig build, GCC:
>
> make -j72 allmodconfig
> make -j72 -s clean && time make -j72
>
> kbuild/for-next:
> 6m16.140s
> 6m19.742s
> 6m15.848s
>
> +this-series:
> 6m22.742s
> 6m20.589s
> 6m19.911s
>
> Though with not so many modules, it's within the noise:
>
> defconfig build, GCC:
>
> make -j72 defconfig
> make -j72 -s clean && time make -j72
>
> kbuild/for-next:
> 0m41.579s
> 0m41.214s
> 0m41.370s
>
> +series:
> 0m41.423s
> 0m41.434s
> 0m41.384s
>
>
> However, I do see that even LTO builds are slightly slower now, so
> perhaps the above numbers aren't due to the added $(AR) step:
>
> allmodconfig + Clang ThinLTO:
>
> make -j72 LLVM=1 LLVM_IAS=1 allmodconfig
> ./scripts/config -d GCOV_KERNEL -d KASAN -d LTO_NONE -e LTO_CLANG_THIN
> make -j72 LLVM=1 LLVM_IAS=1 olddefconfig
> make -j72 -s LLVM=1 LLVM_IAS=1 clean && time make -j72 LLVM=1 LLVM_IAS=1
>
> kbuild/for-next:
> 9m53.927s
> 9m45.874s
> 9m47.722s
>
> +series:
> 9m58.395s
> 9m53.201s
> 9m56.387s



I have not tested this closely, but
this might be the cost of running $(AR) t $<.

In Sami's implementation, the *.symversions files are merged
by a shell command, which presumably runs faster than llvm-ar.
On the other hand, it risks "Argument list too long" errors,
as reported in [1].

[1] https://lore.kernel.org/lkml/20210614094948.30023-1-lecopzer.chen@xxxxxxxxxxxx/
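
To illustrate the "Argument list too long" concern in general terms, here
is a minimal shell sketch. It is not Sami's code nor the code in this
series; the file names and symbol lines are made up. It merges many small
files while passing their names over a pipe instead of on a single
command line, so xargs can split them into batches:

```shell
# Hypothetical demo: create 200 small files that stand in for
# per-object *.symversions files (names and contents are invented).
tmpdir=$(mktemp -d)
i=1
while [ "$i" -le 200 ]; do
    echo "__crc_sym$i = 0x$(printf '%08x' "$i");" > "$tmpdir/obj$i.symversions"
    i=$((i + 1))
done

# find writes the names to a pipe; xargs runs cat in as many batches as
# needed, so no single command line has to hold every file name.
find "$tmpdir" -name '*.symversions' -print0 | xargs -0 cat > "$tmpdir/merged"

# One line per input file is expected in the merged result.
merged_lines=$(wc -l < "$tmpdir/merged" | tr -d ' ')
echo "$merged_lines"

rm -rf "$tmpdir"
```

With only 200 files, a plain "cat *.symversions" would work as well; the
batching only matters once the list of names approaches the system limit.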


Anyway, when I find time,
I will do some benchmarking.




>
>
> I haven't been able to isolate where the changes in build times are
> coming from (nor have I done link-phase-only timings -- I realize those
> are really the most important).
>
> I did notice some warnings from this patch, though, in the
> $(modules-single) target:
>
> scripts/Makefile.build:434: target 'drivers/scsi/libiscsi.a' given more than once in the same rule
> scripts/Makefile.build:434: target 'drivers/atm/suni.a' given more than once in the same rule


Ah, right.

I also noticed needless rebuilds of prelink.symversions.

In v2, I will fix both issues as follows:


diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index 957addea830b..cf6b79dff5f9 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -438,6 +438,8 @@ cmd_merge_symver = \
 $(obj)/%.prelink.symversions: $(obj)/%.a FORCE
 	$(call if_changed,merge_symver)

+targets += $(patsubst %.a, %.prelink.symversions, $(modules))
+
 $(obj)/%.prelink.o: ld_flags += --script=$(filter %.symversions,$^)
 module-symver = $(obj)/%.prelink.symversions

diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
index f604d2d01cad..5074922db82d 100644
--- a/scripts/Makefile.lib
+++ b/scripts/Makefile.lib
@@ -107,8 +107,8 @@ real-dtb-y := $(addprefix $(obj)/, $(real-dtb-y))
 subdir-ym := $(addprefix $(obj)/,$(subdir-ym))

 modules := $(patsubst %.o, %.a, $(obj-m))
-modules-multi := $(patsubst %.o, %.a, $(multi-obj-m))
-modules-single := $(filter-out $(modules-multi), $(filter %.a, $(modules)))
+modules-multi := $(sort $(patsubst %.o, %.a, $(multi-obj-m)))
+modules-single := $(sort $(filter-out $(modules-multi), $(filter %.a, $(modules))))

 # Finds the multi-part object the current object will be linked into.
 # If the object belongs to two or more multi-part objects, list them all.
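
For reference, a standalone sketch of why $(sort) silences the "given more
than once in the same rule" warning: besides sorting, $(sort) removes
duplicate words. The Makefile below is hypothetical and not part of kbuild;
obj-m, foo.o and bar.o are made-up values.

```make
# Hypothetical standalone Makefile, not kbuild code.
# foo.o is listed twice on purpose, as can happen when the same
# module is added to obj-m from more than one place.
obj-m := foo.o bar.o foo.o

modules        := $(patsubst %.o,%.a,$(obj-m))
modules-sorted := $(sort $(modules))

all: $(modules-sorted)

$(info unsorted: $(modules))
$(info sorted:   $(modules-sorted))

# Using $(modules) as the target list here would name foo.a twice in
# one rule, which is what GNU Make warns about.
$(modules-sorted): ; @echo "would archive $@"

.PHONY: all $(modules-sorted)
```

Switching the target list of the last rule from $(modules-sorted) to
$(modules) should reproduce the warning for foo.a.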








--
Best Regards
Masahiro Yamada