Re: [PATCH v3 16/16] objtool,x86: Rewrite retpoline thunk calls

From: Fangrui Song
Date: Fri Jun 04 2021 - 19:52:11 EST


On 2021-06-04, 'Nick Desaulniers' via Clang Built Linux wrote:
On Fri, Jun 4, 2021 at 1:50 PM Nick Desaulniers <ndesaulniers@xxxxxxxxxx> wrote:

(Manually replying to https://lore.kernel.org/lkml/CAFJ_xbq06nfaEWtVNLtg7XCJrQeQ9wCs4Zsoi5Y_HP3Dx0iTRA@xxxxxxxxxxxxxx/)

Hi Peter,
We're also tracking 2 recent regressions that look like they've come from this
patch.

https://github.com/ClangBuiltLinux/linux/issues/1384
https://github.com/ClangBuiltLinux/linux/issues/1388

(Both in linux-next at the moment).

The first looks like a boot failure. The second is a warning from the
linker on a kernel module; even readelf seems unhappy with objtool's
output.

I can more easily reproduce the latter, so I'm working on getting a smaller
reproducer. I'll let you know when I have it, but wanted to report it ASAP.

Sent a pretty big attachment, privately. I was able to capture the
before/after with:

$ echo 'CONFIG_GCOV_KERNEL=n
CONFIG_KASAN=n
CONFIG_LTO_CLANG_THIN=y' >allmod.config
$ OBJTOOL_ARGS="--backup" make -kj"$(nproc)" KCONFIG_ALLCONFIG=1 \
    LLVM=1 LLVM_IAS=1 all

It looks like

$ ./tools/objtool/objtool orc generate --module --no-fp
--no-unreachable --retpoline --uaccess --mcount
drivers/gpu/drm/amd/amdgpu/amdgpu.lto.o; ld.lld -r -m elf_x86_64
-plugin-opt=-code-model=kernel -plugin-opt=-stack-alignment=8
--thinlto-cache-dir=.thinlto-cache -mllvm -import-instr-limit=5
-plugin-opt=-warn-stack-size=2048 --build-id=sha1 -T
scripts/module.lds -o drivers/gpu/drm/amd/amdgpu/amdgpu.ko
drivers/gpu/drm/amd/amdgpu/amdgpu.lto.o
drivers/gpu/drm/amd/amdgpu/amdgpu.mod.o

is producing the linker error:

ld.lld: error: drivers/gpu/drm/amd/amdgpu/amdgpu.lto.o:
SHT_SYMTAB_SHNDX has 79581 entries, but the symbol table associated
has 79582
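
To make the mismatch concrete, here is a rough sketch of the kind of consistency check the error message implies. It is not ld.lld's code. The function name and the SKETCH_* constants are made up for illustration; the entry sizes (24 bytes per Elf64_Sym, 4 bytes per extended index word) follow the ELF64 layout.

```c
#include <stdint.h>

/* Entry sizes for ELF64; the names are hypothetical, the values are from the ELF layout. */
enum {
    SKETCH_SYM64_SIZE       = 24, /* sizeof(Elf64_Sym) */
    SKETCH_SHNDX_ENTRY_SIZE = 4   /* one 32-bit word per symbol */
};

/*
 * Return 1 when the extended section index table has exactly one entry
 * per symbol table entry, 0 otherwise. Arguments are the sh_size values
 * (in bytes) of the two sections.
 */
int sketch_shndx_matches_symtab(uint64_t symtab_bytes, uint64_t shndx_bytes)
{
    uint64_t nsyms  = symtab_bytes / SKETCH_SYM64_SIZE;
    uint64_t nshndx = shndx_bytes / SKETCH_SHNDX_ENTRY_SIZE;

    return nsyms == nshndx;
}
```

With the numbers from the error above (79582 symbols, 79581 index entries) this returns 0, which is consistent with a symbol having been appended without a matching index entry.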

Readelf having issues with the output:
$ readelf -s amdgpu.lto.o.orig
<works fine>
$ readelf -s amdgpu.lto.o
readelf: Error: Reading 73014451695 bytes extends past end of file for
string table
$ llvm-readelf -s amdgpu.lto.o
llvm-readelf: error: 'amdgpu.lto.o': unable to continue dumping, the
file is corrupt: section table goes past the end of file

`file` having issues:
$ file drivers/gpu/drm/amd/amdgpu/amdgpu.lto.o
drivers/gpu/drm/amd/amdgpu/amdgpu.lto.o: ELF 64-bit LSB relocatable,
x86-64, version 1 (SYSV), no section header

for comparison:
$ file ./drivers/spi/spi-ath79.lto.o
./drivers/spi/spi-ath79.lto.o: ELF 64-bit LSB relocatable, x86-64,
version 1 (SYSV), not stripped

tools/objtool/elf.c:elf_add_symbol may not update .symtab_shndx.
For reference, llvm-objcopy finalizes the content of .symtab_shndx when
.symtab is finalized. objtool may want to adopt a similar approach.
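
A minimal sketch of the "finalize both together" idea, assuming a simplified in-memory symbol list. This is neither objtool's nor llvm-objcopy's actual code: struct sketch_sym and sketch_finalize_shndx are hypothetical names, and reserved indices such as SHN_ABS are not modeled. The SHN_LORESERVE and SHN_XINDEX values are the ones from the ELF specification.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_SHN_LORESERVE 0xff00u /* first reserved section index */
#define SKETCH_SHN_XINDEX    0xffffu /* "real index is in the shndx table" */

/* Hypothetical in-memory symbol: only the fields needed for this sketch. */
struct sketch_sym {
    uint32_t sec_index; /* real index of the defining section */
    uint16_t st_shndx;  /* value to be written into the symbol entry */
};

/*
 * Walk all symbols once and emit st_shndx and the parallel index table
 * together, so both always end up with exactly n entries.
 */
void sketch_finalize_shndx(struct sketch_sym *syms, size_t n,
                           uint32_t *shndx_table)
{
    for (size_t i = 0; i < n; i++) {
        if (syms[i].sec_index >= SKETCH_SHN_LORESERVE) {
            /* Index does not fit in 16 bits: use the escape value. */
            syms[i].st_shndx = SKETCH_SHN_XINDEX;
            shndx_table[i] = syms[i].sec_index;
        } else {
            /* Index fits: the table entry must be zero. */
            syms[i].st_shndx = (uint16_t)syms[i].sec_index;
            shndx_table[i] = 0;
        }
    }
}
```

Because the table is regenerated from the full symbol list, a symbol added late (as elf_add_symbol does) cannot leave the two sections with different entry counts.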

read_symbols looks up the section by name (".symtab_shndx"). It'd be
better to identify it by section type, SHT_SYMTAB_SHNDX.
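
Roughly what a type-based lookup could look like, as a sketch only. struct sketch_section and sketch_find_by_type are hypothetical and do not match objtool's real data structures. The type value 18 for SHT_SYMTAB_SHNDX is from the ELF specification.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_SHT_SYMTAB_SHNDX 18u /* SHT_SYMTAB_SHNDX in the ELF spec */

/* Hypothetical, stripped-down section descriptor. */
struct sketch_section {
    const char *name;
    uint32_t    sh_type;
};

/*
 * Return the first section with the requested sh_type, or NULL if none
 * exists. The section name is never consulted.
 */
const struct sketch_section *
sketch_find_by_type(const struct sketch_section *secs, size_t n, uint32_t type)
{
    for (size_t i = 0; i < n; i++) {
        if (secs[i].sh_type == type)
            return &secs[i];
    }
    return NULL;
}
```

A type match still finds the table if a producer gave the section a different name, which a string comparison against ".symtab_shndx" would miss.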