[tip:x86/build] x86/alternatives: Macrofy lock prefixes to work around GCC inlining bugs

From: tip-bot for Nadav Amit
Date: Thu Oct 04 2018 - 06:03:59 EST


Commit-ID: 77f48ec28e4ccff94d2e5f4260a83ac27a7f3099
Gitweb: https://git.kernel.org/tip/77f48ec28e4ccff94d2e5f4260a83ac27a7f3099
Author: Nadav Amit <namit@xxxxxxxxxx>
AuthorDate: Wed, 3 Oct 2018 14:30:55 -0700
Committer: Ingo Molnar <mingo@xxxxxxxxxx>
CommitDate: Thu, 4 Oct 2018 11:24:59 +0200

x86/alternatives: Macrofy lock prefixes to work around GCC inlining bugs

As described in:

77b0bf55bc67: ("kbuild/Makefile: Prepare for using macros in inline assembly code to work around asm() related GCC inlining bugs")

GCC's inlining heuristics are broken with common asm() patterns used in
kernel code, resulting in the effective disabling of inlining.

The workaround is to define an assembly macro and call it from the inline
assembly block - i.e. to macrofy the affected block.

As a result GCC treats the inline assembly block as a single instruction
when estimating function size.
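
For illustration, compare the two shapes below (hypothetical helper names,
not part of this patch). GCC roughly estimates the cost of an asm()
statement from the number of lines in its string, so the open-coded form
makes the containing function look large and can block inlining, while the
macrofied form is costed as one instruction:

    /* Open-coded: multi-line asm string, overestimated by GCC */
    static inline void lock_inc_open_coded(int *p)
    {
            asm volatile(".pushsection .smp_locks,\"a\"\n"
                         ".balign 4\n"
                         ".long 671f - .\n"
                         ".popsection\n"
                         "671:\n"
                         "\tlock; incl %0"
                         : "+m" (*p));
    }

    /* Macrofied: a single line from GCC's point of view */
    static inline void lock_inc_macrofied(int *p)
    {
            asm volatile("LOCK_PREFIX incl %0" : "+m" (*p));
    }

Both emit the same instructions; only GCC's size estimate differs.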

This patch handles the LOCK prefix, allowing more aggressive inlining:

    text     data     bss      dec     hex filename
18140140 10225284 2957312 31322736 1ddf270 ./vmlinux before
18146889 10225380 2957312 31329581 1de0d2d ./vmlinux after (+6845)

This is the reduction in non-inlined functions:

Before: 40286
After: 40218 (-68)
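
C-level users are unchanged; the atomics, bitops and locking primitives
keep using LOCK_PREFIX inside their asm() strings. For example (the shape
of arch_atomic_inc() in arch/x86/include/asm/atomic.h, paraphrased):

    static __always_inline void arch_atomic_inc(atomic_t *v)
    {
            asm volatile(LOCK_PREFIX "incl %0"
                         : "+m" (v->counter));
    }

Only the expansion of LOCK_PREFIX changes underneath them.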

Tested-by: Kees Cook <keescook@xxxxxxxxxxxx>
Signed-off-by: Nadav Amit <namit@xxxxxxxxxx>
Acked-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Cc: Andy Lutomirski <luto@xxxxxxxxxxxxxx>
Cc: Borislav Petkov <bp@xxxxxxxxx>
Cc: Brian Gerst <brgerst@xxxxxxxxx>
Cc: Denys Vlasenko <dvlasenk@xxxxxxxxxx>
Cc: H. Peter Anvin <hpa@xxxxxxxxx>
Cc: Josh Poimboeuf <jpoimboe@xxxxxxxxxx>
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Link: http://lkml.kernel.org/r/20181003213100.189959-6-namit@xxxxxxxxxx
Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
---
arch/x86/include/asm/alternative-asm.h | 20 ++++++++++++++------
arch/x86/include/asm/alternative.h | 11 ++---------
arch/x86/kernel/macros.S | 1 +
3 files changed, 17 insertions(+), 15 deletions(-)

diff --git a/arch/x86/include/asm/alternative-asm.h b/arch/x86/include/asm/alternative-asm.h
index 31b627b43a8e..8e4ea39e55d0 100644
--- a/arch/x86/include/asm/alternative-asm.h
+++ b/arch/x86/include/asm/alternative-asm.h
@@ -7,16 +7,24 @@
 #include <asm/asm.h>
 
 #ifdef CONFIG_SMP
-        .macro LOCK_PREFIX
-672:    lock
+.macro LOCK_PREFIX_HERE
         .pushsection .smp_locks,"a"
         .balign 4
-        .long 672b - .
+        .long 671f - .          # offset
         .popsection
-        .endm
+671:
+.endm
+
+.macro LOCK_PREFIX insn:vararg
+        LOCK_PREFIX_HERE
+        lock \insn
+.endm
 #else
-        .macro LOCK_PREFIX
-        .endm
+.macro LOCK_PREFIX_HERE
+.endm
+
+.macro LOCK_PREFIX insn:vararg
+.endm
 #endif
 
 /*
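
With these macros in place, an invocation such as (hypothetical operand):

    LOCK_PREFIX incl (%rdi)

expands under CONFIG_SMP to:

    .pushsection .smp_locks,"a"
    .balign 4
    .long 671f - .          # offset
    .popsection
671:
    lock incl (%rdi)

i.e. the address of the LOCK-prefixed instruction is recorded in the
.smp_locks section, which the alternatives code uses to patch out the
prefixes when booting on a uniprocessor machine.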
diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
index 4cd6a3b71824..d7faa16622d8 100644
--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -31,15 +31,8 @@
  */
 
 #ifdef CONFIG_SMP
-#define LOCK_PREFIX_HERE \
-                ".pushsection .smp_locks,\"a\"\n"      \
-                ".balign 4\n"                          \
-                ".long 671f - .\n" /* offset */        \
-                ".popsection\n"                        \
-                "671:"
-
-#define LOCK_PREFIX LOCK_PREFIX_HERE "\n\tlock; "
-
+#define LOCK_PREFIX_HERE "LOCK_PREFIX_HERE\n\t"
+#define LOCK_PREFIX "LOCK_PREFIX "
 #else /* ! CONFIG_SMP */
 #define LOCK_PREFIX_HERE ""
 #define LOCK_PREFIX ""
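
On the C side, string concatenation now turns a user like

    asm volatile(LOCK_PREFIX "incl %0" : "+m" (v->counter));

into the asm string "LOCK_PREFIX incl %0", which the assembler parses as
an invocation of the LOCK_PREFIX macro with "incl %0" as its insn:vararg
argument - the C compiler only ever sees the short one-line string.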
diff --git a/arch/x86/kernel/macros.S b/arch/x86/kernel/macros.S
index f1fe1d570365..852487a9fc56 100644
--- a/arch/x86/kernel/macros.S
+++ b/arch/x86/kernel/macros.S
@@ -8,3 +8,4 @@

 #include <linux/compiler.h>
 #include <asm/refcount.h>
+#include <asm/alternative-asm.h>
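
Including alternative-asm.h here is what makes the LOCK_PREFIX and
LOCK_PREFIX_HERE assembler macros visible to inline asm() in C files: as
set up in 77b0bf55bc67, the preprocessed output of macros.S is made
available to the assembler for every C translation unit, so the macro
names emitted into asm() strings resolve at assembly time.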