[PATCH v2 0/3] x86/process: Optimize __switch_to_xtra()

From: Kyle Huey
Date: Tue Feb 14 2017 - 03:11:25 EST

GCC generates lousy code in __switch_to_xtra(). This patch series is an
updated version of tglx's patches from last year
(https://lkml.org/lkml/2016/12/15/432) that addresses the review comments.

Since v1:
Part 1 - x86/process: Optimize TIF checks in __switch_to_xtra()
- READ_ONCE annotations added as requested by Andy Lutomirski

Part 2 - x86/process: Correct and optimize TIF_BLOCKSTEP switch
- DEBUGCTLMSR_BTF is now modified when the previous task, the next
task, or both use it, because the MSR is "highly magical".

Part 3 - x86/process: Optimize TIF_NOTSC switch
- Unchanged
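
For readers without the diffs at hand, here is a rough user-space sketch
of the pattern behind Parts 1 and 2: snapshot both tasks' flags once,
then rewrite the BTF bit whenever either task has blockstep set. It is
not the code in the patches. struct task, switch_extra(), fake_debugctl,
msr_writes and the *_BIT constants are hypothetical stand-ins, and
READ_ONCE_UL() only approximates the kernel's READ_ONCE().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; bit positions are illustrative only. */
#define TIF_BLOCKSTEP_BIT  (1UL << 0)
#define DEBUGCTL_BTF_BIT   (1ULL << 1)

/* User-space approximation of READ_ONCE(): one volatile load. */
#define READ_ONCE_UL(x) (*(const volatile unsigned long *)&(x))

struct task {
	unsigned long flags;
};

static uint64_t fake_debugctl;	/* simulated MSR value */
static unsigned int msr_writes;	/* counts simulated MSR writes */

static void switch_extra(struct task *prev, struct task *next)
{
	unsigned long tifp, tifn;

	assert(prev && next);

	/* Snapshot each flags word once so later tests see one value. */
	tifp = READ_ONCE_UL(prev->flags);
	tifn = READ_ONCE_UL(next->flags);

	/* Touch the simulated MSR if prev, next, or both use blockstep. */
	if ((tifp | tifn) & TIF_BLOCKSTEP_BIT) {
		uint64_t val = fake_debugctl & ~DEBUGCTL_BTF_BIT;

		if (tifn & TIF_BLOCKSTEP_BIT)
			val |= DEBUGCTL_BTF_BIT;
		fake_debugctl = val;
		msr_writes++;
	}
}
```

In this sketch a switch between two tasks that both use blockstep still
rewrites the bit, which mirrors the "either or both" rule above. A
switch between two tasks that do not use it leaves the simulated MSR
alone.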

I didn't introduce a cpufeature for blockstep because that would
add additional overhead compared to the existing code, where it's
generally known at compile time that blockstep is supported. Perhaps
we should just BUG_ON(!arch_has_block_step()) here if we really
care to check anything.

 arch/x86/include/asm/msr-index.h |  1 +
 arch/x86/include/asm/tlbflush.h  | 10 ++++++++++
 arch/x86/kernel/process.c        | 76 +++++++++++++++++++++++++++++++++++-----------------------------------------
 3 files changed, 46 insertions(+), 41 deletions(-)