[PATCH v4] x86/entry: emit a symbol for register restoring thunk

From: Nick Desaulniers
Date: Tue Jan 12 2021 - 14:47:42 EST


Arnd found a randconfig that produces the warning:

  arch/x86/entry/thunk_64.o: warning: objtool: missing symbol for insn at offset 0x3e

when building with LLVM_IAS=1 (Clang's integrated assembler). Josh
notes:

  With the LLVM assembler not generating section symbols, objtool has no
  way to reference this code when it generates ORC unwinder entries,
  because this code is outside of any ELF function.

  The limitation now being imposed by objtool is that all code must be
  contained in an ELF symbol. And .L symbols don't create such symbols.

  So basically, you can use an .L symbol *inside* a function or a code
  segment, you just can't use the .L symbol to contain the code using a
  SYM_*_START/END annotation pair.

Fangrui notes that not emitting unused section symbols is an
optimization that helps reduce image size when compiling with
-ffunction-sections and -fdata-sections. I have observed on the order
of tens of thousands of symbols for the kernel images built with those
flags.

A patch has been authored against GNU binutils to match this behavior,
so this will also become a problem for users of GNU binutils once they
upgrade to 2.36.

We can omit the .L prefix on a label so that the assembler will emit an
entry into the symbol table for the label, with STB_LOCAL binding. This
enables objtool to generate proper unwind info here with LLVM_IAS=1 or
GNU binutils 2.36+.
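
As a rough illustration of the difference (not part of this patch; both
label names below are made up for the example), consider two otherwise
identical labels in an ELF object assembled without --keep-locals:

```asm
	.text

	/* ".L" prefix: assembler-local label, no symbol table entry. */
.L_hypothetical_restore:
	ret

	/* No ".L" prefix and no .globl: a symbol with STB_LOCAL binding. */
hypothetical_restore:
	ret
```

Listing the symbols of the resulting object (for example with nm or
readelf -s) should show hypothetical_restore as a local symbol and no
entry for .L_hypothetical_restore, which is why only the former can be
wrapped in a SYM_*_START/END annotation pair.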

Cc: Fangrui Song <maskray@xxxxxxxxxx>
Link: https://github.com/ClangBuiltLinux/linux/issues/1209
Link: https://reviews.llvm.org/D93783
Link: https://sourceware.org/binutils/docs/as/Symbol-Names.html
Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=d1bcae833b32f1408485ce69f844dcd7ded093a8
Reported-by: Arnd Bergmann <arnd@xxxxxxxx>
Suggested-by: Josh Poimboeuf <jpoimboe@xxxxxxxxxx>
Suggested-by: Borislav Petkov <bp@xxxxxxxxx>
Signed-off-by: Nick Desaulniers <ndesaulniers@xxxxxxxxxx>
---
Changes v3 -> v4:
* Add changes to Documentation/ and include/ as per Boris.
* Fix typos as per Josh.
* Replace link and note in commit message about
--generate-unused-section-symbols=[yes|no] which was dropped from GNU
binutils with link to actual commit in binutils-gdb.
* Add additional notes from Josh in commit message.
* Slightly reword commit message to indicate that section symbols are
not emitted, rather than stripped.

Changes v2 -> v3:
* rework to use STB_LOCAL rather than STB_GLOBAL by dropping .L prefix,
as per Josh.
* rename oneline to drop STB_GLOBAL in commit message.
* add link to GAS docs on .L prefix.
* drop Josh's ack since patch changed.

Changes v1 -> v2:
* Pick up Josh's Ack.
* Add commit message info about -ffunction-sections/-fdata-sections, and
link to binutils patch.

Documentation/asm-annotations.rst | 9 +++++++++
arch/x86/entry/thunk_64.S | 8 ++++----
include/linux/linkage.h | 5 ++++-
3 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/Documentation/asm-annotations.rst b/Documentation/asm-annotations.rst
index 32ea57483378..e711ff98102a 100644
--- a/Documentation/asm-annotations.rst
+++ b/Documentation/asm-annotations.rst
@@ -153,6 +153,15 @@ This section covers ``SYM_FUNC_*`` and ``SYM_CODE_*`` enumerated above.
To some extent, this category corresponds to deprecated ``ENTRY`` and
``END``. Except ``END`` had several other meanings too.

+ Developers should avoid using local symbol names that are prefixed with
+ ``.L``, as this has special meaning for the assembler; a symbol entry will
+ not be emitted into the symbol table. This can prevent ``objtool`` from
+ generating correct unwind info. Symbols with STB_LOCAL binding may still be
+ used, and ``.L`` prefixed local symbol names are still generally useable
+ within a function, but ``.L`` prefixed local symbol names should not be used
+ to denote the beginning or end of code regions via
+ ``SYM_CODE_START_LOCAL``/``SYM_CODE_END``.
+
* ``SYM_INNER_LABEL*`` is used to denote a label inside some
``SYM_{CODE,FUNC}_START`` and ``SYM_{CODE,FUNC}_END``. They are very similar
to C labels, except they can be made global. An example of use::
diff --git a/arch/x86/entry/thunk_64.S b/arch/x86/entry/thunk_64.S
index ccd32877a3c4..c9a9fbf1655f 100644
--- a/arch/x86/entry/thunk_64.S
+++ b/arch/x86/entry/thunk_64.S
@@ -31,7 +31,7 @@ SYM_FUNC_START_NOALIGN(\name)
.endif

call \func
- jmp .L_restore
+ jmp __thunk_restore
SYM_FUNC_END(\name)
_ASM_NOKPROBE(\name)
.endm
@@ -44,7 +44,7 @@ SYM_FUNC_END(\name)
#endif

#ifdef CONFIG_PREEMPTION
-SYM_CODE_START_LOCAL_NOALIGN(.L_restore)
+SYM_CODE_START_LOCAL_NOALIGN(__thunk_restore)
popq %r11
popq %r10
popq %r9
@@ -56,6 +56,6 @@ SYM_CODE_START_LOCAL_NOALIGN(.L_restore)
popq %rdi
popq %rbp
ret
- _ASM_NOKPROBE(.L_restore)
-SYM_CODE_END(.L_restore)
+ _ASM_NOKPROBE(__thunk_restore)
+SYM_CODE_END(__thunk_restore)
#endif
diff --git a/include/linux/linkage.h b/include/linux/linkage.h
index 5bcfbd972e97..11537ba9f512 100644
--- a/include/linux/linkage.h
+++ b/include/linux/linkage.h
@@ -270,7 +270,10 @@
SYM_END(name, SYM_T_FUNC)
#endif

-/* SYM_CODE_START -- use for non-C (special) functions */
+/*
+ * SYM_CODE_START -- use for non-C (special) functions, avoid .L prefixed local
+ * symbol names which may not emit a symbol table entry.
+ */
#ifndef SYM_CODE_START
#define SYM_CODE_START(name) \
SYM_START(name, SYM_L_GLOBAL, SYM_A_ALIGN)
--
2.30.0.284.gd98b1dd5eaa7-goog