Re: [PATCH] objtool: ignore .L prefixed local symbols

From: Fangrui Song
Date: Thu Feb 13 2020 - 17:37:39 EST


On 2020-02-13, Josh Poimboeuf wrote:
> On Thu, Feb 13, 2020 at 10:47:08AM -0800, Nick Desaulniers wrote:
>> Top of tree LLVM has optimizations related to
>> -fno-semantic-interposition to avoid emitting PLT relocations for
>> references to symbols located in the same translation unit, where it
>> will emit "local symbol" references.
>>
>> Clang builds fall back on GNU as for assembling, currently. It appears a
>> bug in GNU as introduced around 2.31 is keeping around local labels in
>> the symbol table, despite the documentation saying:
>>
>> "Local symbols are defined and used within the assembler, but they are
>> normally not saved in object files."
>>
>> When objtool searches for a symbol at a given offset, it's finding the
>> incorrectly kept .L<symbol>$local symbol that should have been discarded
>> by the assembler.
>>
>> A patch for GNU as has been authored. For now, objtool should not treat
>> local symbols as the expected symbol for a given offset when iterating
>> the symbol table.

R_X86_64_PLT32 was fixed (just now) by
https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=292676c15a615b5a95bede9ee91004d3f7ee7dfd
It will be included in binutils 2.35 and probably in a 2.34.x bug-fix release.

>> commit 644592d32837 ("objtool: Fail the kernel build on fatal errors")
>> exposed this issue.

> Since I'm going to be dropping 644592d32837 ("objtool: Fail the kernel
> build on fatal errors") anyway, I wonder if this patch is still needed?
>
> At least the error will be downgraded to a warning. And while the
> warning could be more user friendly, it still has value because it
> reveals a toolchain bug.

I still consider such a check (tools/objtool/check.c:679) unneeded.

st_type doesn't have to be STT_FUNC. Either STT_NOTYPE or STT_FUNC is
ok, and STT_GNU_IFUNC can be ok as well.
(My clang patch skips STT_GNU_IFUNC only because rtld typically doesn't
cache R_*_IRELATIVE results. Two STT_GNU_IFUNC symbols with the same
st_shndx and st_value can produce two R_*_IRELATIVE relocations, which
then have to be resolved twice at runtime.)

		} else if (rela->sym->type == STT_SECTION) {
			insn->call_dest = find_symbol_by_offset(rela->sym->sec,
								rela->addend+4);
			if (!insn->call_dest ||
			    insn->call_dest->type != STT_FUNC) {
				WARN_FUNC("can't find call dest symbol at %s+0x%x",
					  insn->sec, insn->offset,
					  rela->sym->sec->name,
					  rela->addend + 4);
				return -1;
			}
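
A minimal sketch of what a relaxed type test could look like, for
illustration only. call_dest_type_ok() is a hypothetical helper, not
an objtool function; it just encodes the point above that STT_NOTYPE,
STT_FUNC and STT_GNU_IFUNC can all label a valid call destination:

```c
#include <elf.h>	/* STT_NOTYPE, STT_FUNC, STT_GNU_IFUNC */
#include <stdbool.h>

/*
 * Hypothetical helper (not part of objtool): return true for every
 * ELF symbol type that can legitimately mark a call destination.
 */
static bool call_dest_type_ok(unsigned char st_type)
{
	switch (st_type) {
	case STT_NOTYPE:	/* plain label, e.g. .Lprintk$local */
	case STT_FUNC:		/* ordinary function symbol */
	case STT_GNU_IFUNC:	/* indirect function */
		return true;
	default:
		return false;
	}
}
```

With such a helper, the `insn->call_dest->type != STT_FUNC` comparison
would become `!call_dest_type_ok(insn->call_dest->type)`.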


	.section .init.text,"ax",@progbits
	call printk
	call .Lprintk$local
	.text
	.globl printk
	.type printk,@function
printk:
.Lprintk$local:
	ret

% llvm-mc -filetype=obj -triple=riscv64 a.s -mattr=+relax -o a.o
% readelf -Wr a.o

Relocation section '.rela.init.text' at offset 0xa0 contains 4 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000000  0000000200000012 R_RISCV_CALL           0000000000000000 printk + 0
0000000000000000  0000000000000033 R_RISCV_RELAX                             0
0000000000000008  0000000100000012 R_RISCV_CALL           0000000000000000 .Lprintk$local + 0
0000000000000008  0000000000000033 R_RISCV_RELAX                             0


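On RISC-V, when relaxation is enabled, a reference to a .L symbol cannot
be resolved at assembly time because sections can shrink, so the symbol
has to stay in the object file for the relocation to refer to.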

https://sourceware.org/binutils/docs/as/Symbol-Names.html

"Local symbols are defined and used within the assembler, but they are
*normally* not saved in object files."

I consider the GNU as issue a missed optimization rather than a bug.
There is no rigid rule that .L symbols cannot be saved in object files.
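
For reference, the approach named in the patch subject (ignoring .L
prefixed symbols during an offset lookup) can be sketched roughly as
follows. This is not the actual patch: struct demo_sym,
is_dot_l_name() and find_named_symbol() are hypothetical names for a
simplified, array-based symbol table.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified symbol record (not objtool's struct symbol). */
struct demo_sym {
	const char *name;
	unsigned long offset;
};

/* True if the name uses the assembler-local ".L" prefix. */
static bool is_dot_l_name(const char *name)
{
	return strncmp(name, ".L", 2) == 0;
}

/*
 * Return the first symbol at the given offset whose name is not
 * .L-prefixed, or NULL if there is none.
 */
static const struct demo_sym *find_named_symbol(const struct demo_sym *syms,
						size_t n, unsigned long offset)
{
	for (size_t i = 0; i < n; i++) {
		if (syms[i].offset == offset && !is_dot_l_name(syms[i].name))
			return &syms[i];
	}
	return NULL;
}
```

With a table holding both .Lprintk$local and printk at offset 0, the
lookup returns printk regardless of the order in which the two
symbols appear.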