Re: [PATCH 3/4] x86: open-code register save/restore in trace_hardirqs thunks

From: Denys Vlasenko
Date: Sat Jan 10 2015 - 15:14:39 EST


On Sat, Jan 10, 2015 at 3:23 PM, Borislav Petkov <bp@xxxxxxxxx> wrote:
> Bah, I see it. This nasty '$' gets forgotten a lot, maybe we should have
> a check for that in some scripts :-)
>
> Here's the fix:
>
> ---
> Index: b/arch/x86/lib/thunk_64.S
> ===================================================================
> --- a/arch/x86/lib/thunk_64.S 2015-01-10 15:18:04.418737613 +0100
> +++ b/arch/x86/lib/thunk_64.S 2015-01-10 15:17:18.882736556 +0100
> @@ -67,7 +67,7 @@ restore:
> movq_cfi_restore 6*8, rdx
> movq_cfi_restore 7*8, rsi
> movq_cfi_restore 8*8, rdi
> - addq 9*8, %rsp
> + addq $9*8, %rsp
> CFI_ADJUST_CFA_OFFSET -9*8
> ret

Thanks!
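
As for having a script check for the forgotten '$': a minimal sketch of
what such a check might look like is below. It is only an illustration.
The function name and the regex are made up here, it is not an existing
kernel script, and it only looks at add/sub with a bare numeric source
operand, which gas treats as an absolute memory address rather than an
immediate.

```python
import re

# Hypothetical helper, not an existing kernel script: flag AT&T-syntax
# add/sub lines whose source operand is a bare number (no '$'). gas
# assembles such an operand as an absolute memory reference.
SUSPECT = re.compile(r"^\s*(?:add|sub)[bwlq]?\s+\d[\d\s*+-]*,\s*%\w+")

def find_missing_dollar(asm_text):
    """Return (line_number, line) pairs that look like a forgotten '$'."""
    hits = []
    for lineno, line in enumerate(asm_text.splitlines(), start=1):
        if SUSPECT.match(line):
            hits.append((lineno, line.strip()))
    return hits
```

Operands like 8(%rsp) are not flagged, since the number is followed by a
parenthesis and not by a comma.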

After seeing the disassembly I posted myself, I can't help but wonder
why we use 5-byte instructions to store and load regs on the stack when
pushes and pops are only 1 or 2 bytes long.

Especially since the 32-bit code *does* use push/pop.
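
To put rough numbers on it, here is a back-of-the-envelope sketch for
one save plus one restore of the nine registers. The function names are
made up for illustration. The 5-byte movq length is the figure quoted
above (a slot at offset 0 may encode shorter), and the 4-byte length for
the %rsp adjustment assumes an 8-bit immediate, so treat the movq total
as an estimate.

```python
# push/pop of the legacy registers is one opcode byte; r8..r15 need an
# extra REX prefix byte, so two bytes each.
LEGACY = ["rdi", "rsi", "rdx", "rcx", "rax"]
EXTENDED = ["r8", "r9", "r10", "r11"]

def push_pop_bytes():
    """Bytes for pushing all nine registers and popping them again."""
    one_way = 1 * len(LEGACY) + 2 * len(EXTENDED)
    return 2 * one_way

def movq_bytes(insn_len=5, rsp_adjust_len=4):
    """Estimated bytes for the movq-based save and restore.

    insn_len and rsp_adjust_len are assumptions: 5 bytes per
    'movq reg, off(%rsp)' and 4 bytes per 'subq/addq $imm8, %rsp'.
    """
    nregs = len(LEGACY) + len(EXTENDED)
    one_way = nregs * insn_len + rsp_adjust_len
    return 2 * one_way

print(push_pop_bytes(), movq_bytes())
```

This covers only one save/restore pair, not a whole object file.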

Can you test the attached patch with your kvm guest testcase?

From 2f636e0a92db898f2bdb592027aa302fcb32a326 Mon Sep 17 00:00:00 2001
From: Denys Vlasenko <dvlasenk@xxxxxxxxxx>
To: linux-kernel@xxxxxxxxxxxxxxx
Subject: [PATCH 3/4] x86: open-code register save/restore in trace_hardirqs thunks

This is a preparatory patch for a change in "struct pt_regs"
handling in entry_64.S.

trace_hardirqs thunks were (ab)using a part of pt_regs
handling code, namely SAVE_ARGS/RESTORE_ARGS macros,
to save/restore registers across C function calls.

Since SAVE_ARGS is going to be changed, open-code
register saving/restoring here. Take a page from thunk_32.S
and use push/pop insns instead of movq, since they are far shorter
(1 or 2 bytes versus 5) and need no extra insns to adjust %rsp:

   text    data     bss     dec     hex filename
    333      40       0     373     175 thunk_64_movq.o
    104      40       0     144      90 thunk_64_push_pop.o

Incidentally, this removes a bit of dead code:
one SAVE_ARGS was used just to emit a CFI annotation,
but it also generated unreachable assembly insns.

Signed-off-by: Denys Vlasenko <dvlasenk@xxxxxxxxxx>
CC: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
CC: Oleg Nesterov <oleg@xxxxxxxxxx>
CC: "H. Peter Anvin" <hpa@xxxxxxxxx>
CC: Borislav Petkov <bp@xxxxxxxxx>
CC: Andy Lutomirski <luto@xxxxxxxxxxxxxx>
CC: Frederic Weisbecker <fweisbec@xxxxxxxxx>
CC: X86 ML <x86@xxxxxxxxxx>
CC: Alexei Starovoitov <ast@xxxxxxxxxxxx>
CC: Will Drewry <wad@xxxxxxxxxxxx>
CC: Kees Cook <keescook@xxxxxxxxxxxx>
CC: linux-kernel@xxxxxxxxxxxxxxx
---
arch/x86/lib/thunk_64.S | 46 ++++++++++++++++++++++++++++++++++++++++++----
1 file changed, 42 insertions(+), 4 deletions(-)

diff --git a/arch/x86/lib/thunk_64.S b/arch/x86/lib/thunk_64.S
index b30b5eb..8ec443a 100644
--- a/arch/x86/lib/thunk_64.S
+++ b/arch/x86/lib/thunk_64.S
@@ -17,9 +17,27 @@
CFI_STARTPROC

/* this one pushes 9 elems, the next one would be %rIP */
- SAVE_ARGS
+ pushq_cfi %rdi
+ CFI_REL_OFFSET rdi, 0
+ pushq_cfi %rsi
+ CFI_REL_OFFSET rsi, 0
+ pushq_cfi %rdx
+ CFI_REL_OFFSET rdx, 0
+ pushq_cfi %rcx
+ CFI_REL_OFFSET rcx, 0
+ pushq_cfi %rax
+ CFI_REL_OFFSET rax, 0
+ pushq_cfi %r8
+ CFI_REL_OFFSET r8, 0
+ pushq_cfi %r9
+ CFI_REL_OFFSET r9, 0
+ pushq_cfi %r10
+ CFI_REL_OFFSET r10, 0
+ pushq_cfi %r11
+ CFI_REL_OFFSET r11, 0

.if \put_ret_addr_in_rdi
+ /* 9*8(%rsp) is return addr on stack */
movq_cfi_restore 9*8, rdi
.endif

@@ -45,11 +63,31 @@
#endif
#endif

- /* SAVE_ARGS below is used only for the .cfi directives it contains. */
+#if defined(CONFIG_TRACE_IRQFLAGS) \
+ || defined(CONFIG_DEBUG_LOCK_ALLOC) \
+ || defined(CONFIG_PREEMPT)
CFI_STARTPROC
- SAVE_ARGS
+ CFI_ADJUST_CFA_OFFSET 9*8
restore:
- RESTORE_ARGS
+ popq_cfi %r11
+ CFI_RESTORE r11
+ popq_cfi %r10
+ CFI_RESTORE r10
+ popq_cfi %r9
+ CFI_RESTORE r9
+ popq_cfi %r8
+ CFI_RESTORE r8
+ popq_cfi %rax
+ CFI_RESTORE rax
+ popq_cfi %rcx
+ CFI_RESTORE rcx
+ popq_cfi %rdx
+ CFI_RESTORE rdx
+ popq_cfi %rsi
+ CFI_RESTORE rsi
+ popq_cfi %rdi
+ CFI_RESTORE rdi
ret
CFI_ENDPROC
_ASM_NOKPROBE(restore)
+#endif
--
1.8.1.4