[PATCH] x86: Return to kernel without IRET

From: Andy Lutomirski
Date: Fri May 02 2014 - 19:52:20 EST


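When an interrupt or trap arrives while the CPU is running in kernel
space, return with popfq and retq (plus sti when IF was set) instead
of iretq. On my box, this saves about 100ns on each such interrupt
and trap, and it speeds up my kernel_pf microbenchmark by about 17%.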

Signed-off-by: Andy Lutomirski <luto@xxxxxxxxxxxxxx>
---

Changes from the RFC:
- Much better comments
- Rewritten to use popq_cfi directly instead of RESTORE_ARGS
- Uses sti to restore IF so we get the interrupt shadow
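
For readers who want to check the stack arithmetic, here is a rough
user-space C model of the rewrite done in the patch below. It is only a
sketch under stated assumptions, not kernel code: memory is an array of
8-byte words, "addresses" are word indices rather than byte addresses,
and the names cpu_model, pop, fast_kernel_return and the SLOT_*
constants are made up for illustration. It does not model the
interrupt shadow itself, only the final register state.

```c
#include <assert.h>
#include <stdint.h>

#define X86_EFLAGS_IF (1ULL << 9)   /* interrupt-enable flag, bit 9 */

/* Word offsets from %rsp at retint_restore_args (hypothetical names). */
enum { SLOT_R11, SLOT_R10, SLOT_R9, SLOT_R8, SLOT_RAX, SLOT_RCX, SLOT_RDX,
       SLOT_RSI, SLOT_RDI, SLOT_ORIG_RAX, SLOT_RIP, SLOT_CS, SLOT_EFLAGS,
       SLOT_RSP, SLOT_SS, FRAME_WORDS };

struct cpu_model {
	uint64_t rsp;      /* index into mem[], in 8-byte words */
	uint64_t rip;
	uint64_t rflags;
};

/* Model of a pop: read the word at rsp, then advance rsp by one word. */
static uint64_t pop(struct cpu_model *cpu, const uint64_t *mem)
{
	return mem[cpu->rsp++];
}

static void fast_kernel_return(struct cpu_model *cpu, uint64_t *mem)
{
	uint64_t *frame = &mem[cpu->rsp];
	uint64_t target = frame[SLOT_RSP];       /* movq RSP-ARGOFFSET(%rsp), %rsi */
	uint64_t flags = frame[SLOT_EFLAGS];     /* movq EFLAGS-ARGOFFSET(%rsp), %rdi */
	uint64_t rip = frame[SLOT_RIP];          /* movq RIP-ARGOFFSET(%rsp), %rcx */
	int if_was_set = (flags & X86_EFLAGS_IF) != 0;  /* btr $9: CF = old IF */
	uint64_t new_rsp;
	int i;

	assert(target >= 2);
	target -= 2;                             /* subq $16, %rsi (two words) */
	mem[target] = flags & ~X86_EFLAGS_IF;    /* EFLAGS with IF cleared */
	mem[target + 1] = rip;                   /* return address for retq */
	frame[SLOT_ORIG_RAX] = target;           /* pointer to the EFLAGS slot */

	/* Nine popq instructions; the model discards the register values. */
	for (i = 0; i < SLOT_ORIG_RAX; i++)
		(void)pop(cpu, mem);

	new_rsp = pop(cpu, mem);                 /* popq %rsp */
	cpu->rsp = new_rsp;
	cpu->rflags = pop(cpu, mem);             /* popfq, IF still clear */
	if (if_was_set)
		cpu->rflags |= X86_EFLAGS_IF;    /* sti */
	cpu->rip = pop(cpu, mem);                /* retq */
}
```

In this model the two new slots land either on the old RSP and SS
slots (no alignment gap) or on the SS slot and the gap, and after the
final pop rsp is back at the interrupted code's saved value.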

arch/x86/kernel/entry_64.S | 51 ++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 49 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 1e96c36..0f6fe36 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -1023,7 +1023,7 @@ retint_check:

retint_swapgs: /* return to user-space */
/*
- * The iretq could re-enable interrupts:
+ * The sti could re-enable interrupts:
*/
DISABLE_INTERRUPTS(CLBR_ANY)
TRACE_IRQS_IRETQ
@@ -1033,9 +1033,56 @@ retint_swapgs: /* return to user-space */
retint_restore_args: /* return to kernel space */
DISABLE_INTERRUPTS(CLBR_ANY)
/*
- * The iretq could re-enable interrupts:
+ * The popfq could re-enable interrupts:
*/
TRACE_IRQS_IRETQ
+
+ /*
+ * Fast return to kernel. The stack looks like:
+ *
+ * previous frame
+ * possible 8 byte gap for alignment
+ * SS RSP EFLAGS CS RIP
+ * ORIG_RAX RDI ... R11
+ *
+ * We rewrite it to:
+ *
+ * previous frame
+ * RIP (EFLAGS & ~IF) ...
+ * pointer to the EFLAGS slot
+ * RDI ... R11
+ */
+ movq RSP-ARGOFFSET(%rsp), %rsi
+ subq $16, %rsi
+ movq EFLAGS-ARGOFFSET(%rsp), %rdi
+ movq RIP-ARGOFFSET(%rsp), %rcx
+ btr $9, %rdi
+ movq %rdi, (%rsi)
+ movq %rcx, 8(%rsi)
+ movq %rsi, ORIG_RAX-ARGOFFSET(%rsp)
+ popq_cfi %r11
+ popq_cfi %r10
+ popq_cfi %r9
+ popq_cfi %r8
+ popq_cfi %rax
+ popq_cfi %rcx
+ popq_cfi %rdx
+ popq_cfi %rsi
+ popq_cfi %rdi
+
+ popq %rsp
+ jc 1f
+ /* Interrupts were not enabled */
+ popfq_cfi
+ retq
+1:
+ CFI_ADJUST_CFA_OFFSET 8
+ /* Interrupts were enabled */
+ popfq_cfi
+ sti
+ /* Interrupts are still off because of the one-insn grace period. */
+ retq
+
restore_args:
RESTORE_ARGS 1,8,1

--
1.9.0
