Re: [tip:x86/pti] x86/entry/64: Introduce the PUSH_AND_CLEAN_REGS macro

From: Denys Vlasenko
Date: Mon Feb 12 2018 - 08:29:38 EST


On 02/12/2018 11:17 AM, tip-bot for Dominik Brodowski wrote:
Commit-ID: 7b7b09f110f06f3c006e9b3f4590f7d9ba91345b
Gitweb: https://git.kernel.org/tip/7b7b09f110f06f3c006e9b3f4590f7d9ba91345b
Author: Dominik Brodowski <linux@xxxxxxxxxxxxxxxxxxxx>
AuthorDate: Sun, 11 Feb 2018 11:49:45 +0100
Committer: Ingo Molnar <mingo@xxxxxxxxxx>
CommitDate: Mon, 12 Feb 2018 08:06:36 +0100

x86/entry/64: Introduce the PUSH_AND_CLEAN_REGS macro

Those instances where ALLOC_PT_GPREGS_ON_STACK is called just before
SAVE_AND_CLEAR_REGS can trivially be replaced by PUSH_AND_CLEAN_REGS.
This macro uses PUSH instead of MOV and should therefore be faster, at
least on newer CPUs.

Suggested-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Signed-off-by: Dominik Brodowski <linux@xxxxxxxxxxxxxxxxxxxx>
Cc: Andy Lutomirski <luto@xxxxxxxxxx>
Cc: Borislav Petkov <bp@xxxxxxxxx>
Cc: Brian Gerst <brgerst@xxxxxxxxx>
Cc: Denys Vlasenko <dvlasenk@xxxxxxxxxx>
Cc: H. Peter Anvin <hpa@xxxxxxxxx>
Cc: Josh Poimboeuf <jpoimboe@xxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: dan.j.williams@xxxxxxxxx
Link: http://lkml.kernel.org/r/20180211104949.12992-5-linux@xxxxxxxxxxxxxxxxxxxx
Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
---
arch/x86/entry/calling.h | 36 ++++++++++++++++++++++++++++++++++++
arch/x86/entry/entry_64.S | 6 ++----
2 files changed, 38 insertions(+), 4 deletions(-)

diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
index a05cbb8..57b1b87 100644
--- a/arch/x86/entry/calling.h
+++ b/arch/x86/entry/calling.h
@@ -137,6 +137,42 @@ For 32-bit we have the following conventions - kernel is built with
UNWIND_HINT_REGS offset=\offset
.endm
+ .macro PUSH_AND_CLEAR_REGS
+ /*
+ * Push registers and sanitize registers of values that a
+ * speculation attack might otherwise want to exploit. The
+ * lower registers are likely clobbered well before they
+ * could be put to use in a speculative execution gadget.
+ * Interleave XOR with PUSH for better uop scheduling:
+ */
+ pushq %rdi /* pt_regs->di */
+ pushq %rsi /* pt_regs->si */
+ pushq %rdx /* pt_regs->dx */
+ pushq %rcx /* pt_regs->cx */
+ pushq %rax /* pt_regs->ax */
+ pushq %r8 /* pt_regs->r8 */
+ xorq %r8, %r8 /* nospec r8 */

xorq's are slower than xorl's on Silvermont/Knights Landing.
I propose using xorl instead.

+ pushq %r9 /* pt_regs->r9 */
+ xorq %r9, %r9 /* nospec r9 */
+ pushq %r10 /* pt_regs->r10 */
+ xorq %r10, %r10 /* nospec r10 */
+ pushq %r11 /* pt_regs->r11 */
+ xorq %r11, %r11 /* nospec r11*/
+ pushq %rbx /* pt_regs->rbx */
+ xorl %ebx, %ebx /* nospec rbx*/
+ pushq %rbp /* pt_regs->rbp */
+ xorl %ebp, %ebp /* nospec rbp*/
+ pushq %r12 /* pt_regs->r12 */
+ xorq %r12, %r12 /* nospec r12*/
+ pushq %r13 /* pt_regs->r13 */
+ xorq %r13, %r13 /* nospec r13*/
+ pushq %r14 /* pt_regs->r14 */
+ xorq %r14, %r14 /* nospec r14*/
+ pushq %r15 /* pt_regs->r15 */
+ xorq %r15, %r15 /* nospec r15*/