[RFC v2 0/6] x86: dynamic indirect branch promotion

From: Nadav Amit
Date: Mon Dec 31 2018 - 02:20:53 EST


This is a revised version of optpolines (formerly named relpolines) for
dynamic indirect branch promotion in order to reduce retpoline overheads
[1].
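For readers new to the idea, the following is a minimal user-space sketch
of what indirect branch promotion does at a single call site. It is not
code from this series: the series patches machine code in place, whereas
here the learned target is kept in a hypothetical struct promoted_site, and
promoted_call(), handler_a() and handler_b() are names made up for
illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*handler_fn)(int);

/* Hypothetical per-call-site state (the real series patches code instead). */
struct promoted_site {
	handler_fn likely_target;	/* target learned for this call site */
	unsigned long hits;		/* calls that matched the learned target */
	unsigned long misses;		/* calls that took the fallback path */
};

static int handler_a(int x) { return x + 1; }
static int handler_b(int x) { return x * 2; }

/* Compare the actual target with the learned one before branching. */
static int promoted_call(struct promoted_site *site, handler_fn target, int arg)
{
	assert(site != NULL && target != NULL);

	if (target == site->likely_target) {
		site->hits++;
		/* Stands in for a patched direct call to the learned target. */
		return site->likely_target(arg);
	}

	site->misses++;
	/* Stands in for the slow path, i.e. the retpoline thunk. */
	return target(arg);
}
```

With a site whose likely_target is handler_a, calling through handler_a
takes the compare-and-direct path, while calling through handler_b falls
back to the slow path.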

This version addresses some of the concerns that were raised before.
Accordingly, the code was slightly simplified and patching is now done
using the regular int3/breakpoint mechanism.

Outline optpolines for multiple targets were added. I do not think the
way I implemented it is the correct one. In my original (private)
version, if there are more targets than the outline block can hold, the
outline block is completely removed. However, I think this is
more-or-less how Josh wanted it to be.
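To make the overflow policy of my original (private) version concrete, here
is a rough user-space sketch of it. It only models the behavior described
above and is not the code in this series: struct outline_block,
OUTLINE_SLOTS (set to 2 here just to keep the example small),
outline_add_target() and outline_call() are all hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*handler_fn)(int);

#define OUTLINE_SLOTS 2		/* hypothetical capacity of an outline block */

struct outline_block {
	handler_fn targets[OUTLINE_SLOTS];
	size_t nr_targets;
	int enabled;		/* cleared once the block has been removed */
};

static int inc(int x) { return x + 1; }
static int dbl(int x) { return x * 2; }
static int neg(int x) { return -x; }

/* Returns 1 if the target is held by the block afterwards, 0 otherwise. */
static int outline_add_target(struct outline_block *blk, handler_fn target)
{
	size_t i;

	assert(blk != NULL && target != NULL);
	if (!blk->enabled)
		return 0;

	for (i = 0; i < blk->nr_targets; i++)
		if (blk->targets[i] == target)
			return 1;

	if (blk->nr_targets == OUTLINE_SLOTS) {
		/* More targets than the block can hold: remove it completely. */
		blk->nr_targets = 0;
		blk->enabled = 0;
		return 0;
	}

	blk->targets[blk->nr_targets++] = target;
	return 1;
}

/* Chain of compares; anything not matched takes the fallback path. */
static int outline_call(const struct outline_block *blk, handler_fn target,
			int arg)
{
	size_t i;

	if (blk->enabled)
		for (i = 0; i < blk->nr_targets; i++)
			if (blk->targets[i] == target)
				return blk->targets[i](arg);

	/* Stands in for the retpoline fallback. */
	return target(arg);
}
```

Once a third distinct target shows up, the block is disabled and every call
from that site goes through the fallback path again.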

The code modifications are now done using a gcc-plugin. This makes it
easy to ignore code from init and other code sections. I think it should
also allow us to add opt-in/opt-out support for each branch, for example
by marking function pointers using address-space attributes.

To keep the code simple, these changes required dropping some
optimizations. I have not rerun the benchmarks yet.

I might not have addressed all the open issues, but it is rather hard
to finish the implementation, since some still-open high-level decisions
affect the way in which optimizations should be done.

Specifically:

- Is it going to be the only indirect branch promotion mechanism? If so,
  it should probably also provide an interface similar to Josh's
  "static-calls" with annotations.

- Should it also be used when retpolines are disabled (in the config)?
  This does complicate the implementation a bit (RFC v1 supported it).

- Is it going to be opt-in or opt-out? If it is an opt-out mechanism,
  memory and performance optimizations need to be more aggressive.

- Do we use periodic learning or not? Josh suggested reconfiguring the
  branches whenever a new target is found. However, I do not know at
  this time how to do that efficiently, without making learning much
  more expensive.
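
As a point of reference for the last question, periodic learning could be
sketched roughly as below: targets are sampled into a small table during a
learning window, and the most frequent one is promoted when the window
ends. This is only an illustration of the idea under my own assumptions,
not the scheme implemented in this series; struct learn_table, LEARN_SLOTS,
learn_record() and learn_pick() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*handler_fn)(int);

#define LEARN_SLOTS 4		/* hypothetical number of sampled targets */

struct learn_entry {
	handler_fn target;
	unsigned long count;
};

struct learn_table {
	struct learn_entry e[LEARN_SLOTS];
};

static int inc(int x) { return x + 1; }
static int dbl(int x) { return x * 2; }

/* Record one observed target; samples are dropped once the table is full. */
static void learn_record(struct learn_table *t, handler_fn target)
{
	size_t i;

	assert(t != NULL && target != NULL);
	for (i = 0; i < LEARN_SLOTS; i++) {
		if (t->e[i].target == target) {
			t->e[i].count++;
			return;
		}
		if (t->e[i].target == NULL) {
			t->e[i].target = target;
			t->e[i].count = 1;
			return;
		}
	}
}

/* At the end of a learning window, pick the most frequent target. */
static handler_fn learn_pick(const struct learn_table *t)
{
	handler_fn best = NULL;
	unsigned long best_count = 0;
	size_t i;

	for (i = 0; i < LEARN_SLOTS; i++) {
		if (t->e[i].count > best_count) {
			best = t->e[i].target;
			best_count = t->e[i].count;
		}
	}
	return best;	/* NULL if nothing was observed */
}
```

Reconfiguring on every new target would instead require the recording step
to stay enabled all the time, which is where the cost concern above comes
from.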

[1] https://lore.kernel.org/patchwork/cover/1001332/

Nadav Amit (6):
  x86: introduce kernel restartable sequence
  objtool: ignore instructions
  x86: patch indirect branch promotion
  x86: interface for accessing indirect branch locations
  x86: learning and patching indirect branch targets
  x86: outline optpoline

 arch/x86/Kconfig                             |    4 +
 arch/x86/entry/entry_64.S                    |   16 +-
 arch/x86/include/asm/nospec-branch.h         |   83 ++
 arch/x86/include/asm/sections.h              |    2 +
 arch/x86/kernel/Makefile                     |    1 +
 arch/x86/kernel/asm-offsets.c                |    9 +
 arch/x86/kernel/nospec-branch.c              | 1293 ++++++++++++++++++
 arch/x86/kernel/traps.c                      |    7 +
 arch/x86/kernel/vmlinux.lds.S                |    7 +
 arch/x86/lib/retpoline.S                     |   83 ++
 include/linux/cpuhotplug.h                   |    1 +
 include/linux/module.h                       |    9 +
 kernel/module.c                              |    8 +
 scripts/Makefile.gcc-plugins                 |    3 +
 scripts/gcc-plugins/x86_call_markup_plugin.c |  329 +++++
 tools/objtool/check.c                        |   21 +-
 16 files changed, 1872 insertions(+), 4 deletions(-)
 create mode 100644 arch/x86/kernel/nospec-branch.c
 create mode 100644 scripts/gcc-plugins/x86_call_markup_plugin.c

--
2.17.1