[RFC v2 0/6] x86: dynamic indirect branch promotion
From: Nadav Amit
Date: Mon Dec 31 2018 - 02:20:53 EST
This is a revised version of optpolines (formerly named relpolines) for
dynamic indirect branch promotion in order to reduce retpoline overheads.
This version addresses some of the concerns that were raised before.
Accordingly, the code was slightly simplified and patching is now done
using the regular int3/breakpoint mechanism.
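
For readers new to the idea, indirect branch promotion can be illustrated in
plain C. This is only a minimal sketch under stated assumptions: handler_a,
handler_b, struct site_stats and promoted_call() are hypothetical names, and
the series itself rewrites machine code at runtime rather than compiling C
like this. The sketch shows only the shape of a promoted call site: compare
the function pointer against one predicted target, call it directly on a
match, and otherwise take the slow indirect path (a retpoline in a
retpoline-enabled kernel).

```c
#include <assert.h>
#include <stddef.h>

typedef int (*handler_t)(int);

/* Hypothetical targets standing in for real indirect-call destinations. */
static int handler_a(int x) { return x + 1; }
static int handler_b(int x) { return x * 2; }

/* Illustrative counters only; not part of the series. */
struct site_stats {
	unsigned long direct_hits;
	unsigned long fallbacks;
};

/*
 * Conceptual equivalent of a promoted call site whose predicted
 * target is handler_a.
 */
static int promoted_call(handler_t fp, int arg, struct site_stats *st)
{
	assert(fp != NULL);

	if (fp == handler_a) {
		/* Predicted target: direct call, no indirect branch. */
		st->direct_hits++;
		return handler_a(arg);
	}

	/* Misprediction: fall back to the indirect call. */
	st->fallbacks++;
	return fp(arg);
}
```

Calling promoted_call() with handler_a takes the direct path, while any other
pointer takes the fallback path; the result is the same either way, only the
cost differs.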
Support for outline optpolines with multiple targets was added. I do not
think the way I implemented it is the correct one. In my original (private)
version, if there are more targets than the outline block can hold, the
outline block is completely removed. However, I think this is
more-or-less how Josh wanted it to be.
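
The multi-target case can be sketched in C as follows. This is a hedged
illustration, not the code in the series: struct opt_site, OUTLINE_SLOTS,
site_add_target() and site_call() are hypothetical names, and the capacity of
two outline slots is an arbitrary assumption. The policy shown here (keep the
outline block and send unknown targets to the fallback once it is full) is
just one option; the text above notes that removing the outline block
entirely is another.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*handler_t)(int);

#define OUTLINE_SLOTS 2	/* assumed capacity of the outline block */

/* Hypothetical targets. */
static int h1(int x) { return x + 1; }
static int h2(int x) { return x + 2; }
static int h3(int x) { return x + 3; }
static int h4(int x) { return x + 4; }

struct opt_site {
	handler_t inline_target;		/* compared at the call site */
	handler_t outline[OUTLINE_SLOTS];	/* compared out of line */
	size_t n_outline;
};

enum path { PATH_INLINE, PATH_OUTLINE, PATH_FALLBACK };

/* Record a target; returns 0 if it is tracked, -1 if the site is full. */
static int site_add_target(struct opt_site *s, handler_t t)
{
	assert(t != NULL);
	if (!s->inline_target) {
		s->inline_target = t;
		return 0;
	}
	if (s->inline_target == t)
		return 0;
	for (size_t i = 0; i < s->n_outline; i++)
		if (s->outline[i] == t)
			return 0;
	if (s->n_outline == OUTLINE_SLOTS)
		return -1;
	s->outline[s->n_outline++] = t;
	return 0;
}

/*
 * Dispatch and report which path was taken. In patched machine code the
 * first two paths would be direct calls; C can only model the comparison.
 */
static int site_call(const struct opt_site *s, handler_t fp, int arg,
		     enum path *taken)
{
	assert(fp != NULL);
	if (fp == s->inline_target) {
		*taken = PATH_INLINE;
		return fp(arg);
	}
	for (size_t i = 0; i < s->n_outline; i++) {
		if (fp == s->outline[i]) {
			*taken = PATH_OUTLINE;
			return fp(arg);
		}
	}
	*taken = PATH_FALLBACK;	/* e.g. retpoline */
	return fp(arg);
}
```

With h1, h2 and h3 registered, a fourth target no longer fits and is served
by the fallback path.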
The code modifications are now done using a gcc-plugin. This makes it
easy to ignore code from init and other code sections. I think it should
also allow us to add opt-in/opt-out support for each branch, for example
by marking function pointers using address-space attributes.
All of these changes required some optimizations to go away to keep the
code simple. I still have not rerun the benchmarks.
I may not have addressed all the open issues, but it is rather hard to
finish the implementation, since some high-level decisions that are still
open affect the way in which optimizations should be done:
- Is it going to be the only indirect branch promotion mechanism? If so,
  it probably should also provide an interface similar to Josh's
  "static-calls" with annotations.
- Should it also be used when retpolines are disabled (in the config)?
  This does complicate the implementation a bit (RFC v1 supported it).
- Is it going to be opt-in or opt-out? If it is an opt-out mechanism,
  memory and performance optimizations need to be more aggressive.
- Do we use periodic learning or not? Josh suggested reconfiguring the
  branches whenever a new target is found. However, I do not know at
  this time how to do learning efficiently, without making learning much
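
To make the learning question concrete, here is one hypothetical way to model
target learning for a single call site in C. The cover letter does not
describe how the series implements learning, so everything here is an
assumption: struct learn_table, LEARN_SLOTS, learn_record() and
learn_hottest() are invented names, and a small fixed-size table with hit
counters is only one possible design.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LEARN_SLOTS 4	/* assumed number of targets tracked per site */

struct learn_entry {
	uintptr_t target;	/* 0 means the slot is unused */
	unsigned long hits;
};

struct learn_table {
	struct learn_entry e[LEARN_SLOTS];
	unsigned long dropped;	/* samples that did not fit in the table */
};

/* Record one observed branch target for this call site. */
static void learn_record(struct learn_table *t, uintptr_t target)
{
	assert(target != 0);
	for (size_t i = 0; i < LEARN_SLOTS; i++) {
		if (t->e[i].target == target) {
			t->e[i].hits++;
			return;
		}
		if (t->e[i].target == 0) {
			/* Slots fill in order, so this is the first free one. */
			t->e[i].target = target;
			t->e[i].hits = 1;
			return;
		}
	}
	t->dropped++;
}

/* Return the most frequently seen target, or 0 if nothing was recorded. */
static uintptr_t learn_hottest(const struct learn_table *t)
{
	uintptr_t best = 0;
	unsigned long best_hits = 0;

	for (size_t i = 0; i < LEARN_SLOTS; i++) {
		if (t->e[i].hits > best_hits) {
			best_hits = t->e[i].hits;
			best = t->e[i].target;
		}
	}
	return best;
}
```

Under periodic learning, the hottest target would be read out at the end of a
learning window and patched into the call site; reconfiguring on every new
target would instead act on each learn_record() call that fills a new slot.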
Nadav Amit (6):
  x86: introduce kernel restartable sequence
  objtool: ignore instructions
  x86: patch indirect branch promotion
  x86: interface for accessing indirect branch locations
  x86: learning and patching indirect branch targets
  x86: outline optpoline
 arch/x86/Kconfig                             |    4 +
 arch/x86/entry/entry_64.S                    |   16 +-
 arch/x86/include/asm/nospec-branch.h         |   83 ++
 arch/x86/include/asm/sections.h              |    2 +
 arch/x86/kernel/Makefile                     |    1 +
 arch/x86/kernel/asm-offsets.c                |    9 +
 arch/x86/kernel/nospec-branch.c              | 1293 ++++++++++++++++++
 arch/x86/kernel/traps.c                      |    7 +
 arch/x86/kernel/vmlinux.lds.S                |    7 +
 arch/x86/lib/retpoline.S                     |   83 ++
 include/linux/cpuhotplug.h                   |    1 +
 include/linux/module.h                       |    9 +
 kernel/module.c                              |    8 +
 scripts/Makefile.gcc-plugins                 |    3 +
 scripts/gcc-plugins/x86_call_markup_plugin.c |  329 +++++
 tools/objtool/check.c                        |   21 +-
 16 files changed, 1872 insertions(+), 4 deletions(-)
 create mode 100644 arch/x86/kernel/nospec-branch.c
 create mode 100644 scripts/gcc-plugins/x86_call_markup_plugin.c