Re: Very slow clang kernel config ..

From: Linus Torvalds
Date: Fri Apr 30 2021 - 21:32:37 EST


On Fri, Apr 30, 2021 at 5:25 PM Nick Desaulniers
<ndesaulniers@xxxxxxxxxx> wrote:
>
> Ah, no, sorry, these are the runtime link editor/loader. So probably
> spending quite some time resolving symbols in large binaries.

Yeah. Appended is the profile I get for that "make oldconfig": about
45% of all time seems to be spent in just symbol lookup and
relocation.

And a fair amount of time goes to just creating and tearing down that
huge executable (with a lot of copy-on-write overhead too), with the
kernel side of that being another 15%. That cost is likely also fairly
directly linked to the dynamic linking, which brings in all that data.

Just to compare, btw, this is the symbol lookup overhead for the gcc case:

1.43% ld-2.33.so do_lookup_x
0.96% ld-2.33.so _dl_relocate_object
0.69% ld-2.33.so _dl_lookup_symbol_x

so there really does seem to be something very odd going on with the
clang binary.

Maybe the Fedora binary is built some odd way, but it's likely just
the default clang build.

Linus

----
23.59%  ld-2.33.so        _dl_lookup_symbol_x
11.41%  ld-2.33.so        _dl_relocate_object
 9.95%  ld-2.33.so        do_lookup_x
 4.00%  [kernel.vmlinux]  copy_page
 3.98%  [kernel.vmlinux]  next_uptodate_page
 3.05%  [kernel.vmlinux]  zap_pte_range
 1.81%  [kernel.vmlinux]  clear_page_rep
 1.68%  [kernel.vmlinux]  asm_exc_page_fault
 1.33%  ld-2.33.so        strcmp
 1.33%  ld-2.33.so        check_match
 0.92%  libLLVM-12.so     llvm::StringMapImpl::LookupBucketFor
 0.83%  [kernel.vmlinux]  rmqueue_bulk
 0.77%  conf              yylex
 0.75%  libc-2.33.so      __gconv_transform_utf8_internal
 0.74%  libc-2.33.so      _int_malloc
 0.69%  libc-2.33.so      __strlen_avx2
 0.62%  [kernel.vmlinux]  pagecache_get_page
 0.58%  [kernel.vmlinux]  page_remove_rmap
 0.56%  [kernel.vmlinux]  __handle_mm_fault
 0.54%  [kernel.vmlinux]  filemap_map_pages
 0.54%  libc-2.33.so      __strcmp_avx2
 0.54%  [kernel.vmlinux]  __free_one_page
 0.52%  [kernel.vmlinux]  release_pages
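
For anyone who wants to redo the arithmetic behind the "about 45%" figure, here is a minimal Python sketch that sums the rows quoted in this message. It only re-adds the numbers already shown; it does not run perf. The helper names (parse_profile, total_by_dso, linker_share) are made up for illustration, and treating exactly those three ld.so symbols as "symbol lookup and relocation" is an assumption.

```python
from collections import defaultdict

# Rows copied from the clang "make oldconfig" profile in this message.
CLANG_PROFILE = """\
23.59% ld-2.33.so _dl_lookup_symbol_x
11.41% ld-2.33.so _dl_relocate_object
9.95% ld-2.33.so do_lookup_x
4.00% [kernel.vmlinux] copy_page
3.98% [kernel.vmlinux] next_uptodate_page
3.05% [kernel.vmlinux] zap_pte_range
1.81% [kernel.vmlinux] clear_page_rep
1.68% [kernel.vmlinux] asm_exc_page_fault
1.33% ld-2.33.so strcmp
1.33% ld-2.33.so check_match
0.92% libLLVM-12.so llvm::StringMapImpl::LookupBucketFor
0.83% [kernel.vmlinux] rmqueue_bulk
0.77% conf yylex
0.75% libc-2.33.so __gconv_transform_utf8_internal
0.74% libc-2.33.so _int_malloc
0.69% libc-2.33.so __strlen_avx2
0.62% [kernel.vmlinux] pagecache_get_page
0.58% [kernel.vmlinux] page_remove_rmap
0.56% [kernel.vmlinux] __handle_mm_fault
0.54% [kernel.vmlinux] filemap_map_pages
0.54% libc-2.33.so __strcmp_avx2
0.54% [kernel.vmlinux] __free_one_page
0.52% [kernel.vmlinux] release_pages
"""

# The three dynamic-linker rows quoted for the gcc case.
GCC_LINKER_ROWS = """\
1.43% ld-2.33.so do_lookup_x
0.96% ld-2.33.so _dl_relocate_object
0.69% ld-2.33.so _dl_lookup_symbol_x
"""

# Assumption: these three symbols are the lookup and relocation work proper.
LINKER_SYMBOLS = {"_dl_lookup_symbol_x", "_dl_relocate_object", "do_lookup_x"}


def parse_profile(text):
    """Return (percent, dso, symbol) tuples from 'NN.NN% dso symbol' lines."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        percent, dso, symbol = line.split(None, 2)
        rows.append((float(percent.rstrip("%")), dso, symbol))
    return rows


def total_by_dso(rows):
    """Sum the sampled percentages per shared object."""
    totals = defaultdict(float)
    for percent, dso, _symbol in rows:
        totals[dso] += percent
    return dict(totals)


def linker_share(rows):
    """Sum the percentages of the three lookup/relocation symbols."""
    return sum(pct for pct, _dso, sym in rows if sym in LINKER_SYMBOLS)


if __name__ == "__main__":
    clang_rows = parse_profile(CLANG_PROFILE)
    gcc_rows = parse_profile(GCC_LINKER_ROWS)
    for dso, pct in sorted(total_by_dso(clang_rows).items(), key=lambda kv: -kv[1]):
        print(f"{pct:6.2f}%  {dso}")
    print(f"clang lookup+relocation: {linker_share(clang_rows):.2f}%")
    print(f"gcc   lookup+relocation: {linker_share(gcc_rows):.2f}%")
```

With the rows as quoted, the three lookup and relocation symbols add up to 44.95% for clang, against 3.08% for gcc.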