Re: [RFC PATCH] x86: optimize IRET returns to kernel

From: Steven Rostedt
Date: Tue Mar 31 2015 - 09:49:45 EST


On Tue, 31 Mar 2015 14:46:21 +0200
Denys Vlasenko <dvlasenk@xxxxxxxxxx> wrote:

> @@ -750,6 +750,53 @@ retint_kernel:
> * The iretq could re-enable interrupts:
> */
> TRACE_IRQS_IRETQ
> +
> + /*
> + * Since we return to kernel, CS and SS do not need changing.
> + * Only RSP, RIP and RFLAGS do.
> + * We can use POPF + near RET, which is much faster.
> + * The below code may seem excessive, but IRET is _very_ slow.
> + * Hundreds of cycles.
> + *
> + * However, there is a complication. Interrupts in 64-bit mode
> + * align stack to 16 bytes. This changes location
> + * where we need to store EFLAGS and RIP:
> + */
> +#if 0
> + testb $8, RSP(%rsp)

Shouldn't this be: testb $0xf, RSP(%rsp) ?

The stack should be word (8 byte) aligned, but I never like to assume
anything. An interrupt can come in anywhere, and %rsp can be set to
anything in assembly, so I wouldn't want some "hack" that does a
non-word-aligned %rsp manipulation to suddenly break this. That would
be rather hard to debug.
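
To make the difference between the two masks concrete, here is a small
user-space C model of the alignment step. It is only a sketch, not kernel
code: the function names are made up for illustration, and it assumes the
CPU rounds %rsp down to a 16-byte boundary before pushing the iret frame,
as the patch comment describes.

```c
#include <assert.h>
#include <stdint.h>

/* Model of 64-bit interrupt entry: RSP is rounded down to a 16-byte
 * boundary before SS, RSP, RFLAGS, CS and RIP are pushed. */
static uint64_t iret_frame_top(uint64_t old_rsp)
{
	return old_rsp & ~(uint64_t)0xf;
}

/* Bytes of padding between the interrupted RSP and the top of the frame. */
static uint64_t iret_padding(uint64_t old_rsp)
{
	return old_rsp - iret_frame_top(old_rsp);
}

/* Models "testb $8, RSP(%rsp); jnz 1f": only bit 3 is looked at. */
static int bit3_says_padded(uint64_t old_rsp)
{
	return (old_rsp & 8) != 0;
}

/* Models "testb $0xf, RSP(%rsp); jnz 1f": any of the low four bits. */
static int low_nibble_says_padded(uint64_t old_rsp)
{
	return (old_rsp & 0xf) != 0;
}
```

With an 8-byte aligned %rsp the padding is either 0 or 8 and both tests
agree. With an interrupted %rsp ending in 4, the model gives 4 bytes of
padding: the $8 test reports "no padding", while the $0xf test at least
notices that the frame is not flush. Neither path in the patch handles a
padding other than 0 or 8 exactly.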


> + jnz 1f
> +#else
> + /* There is a complication #2: 64-bit mode has IST stacks */
> + leaq SIZEOF_PTREGS+8(%rsp), %rax
> + cmpq %rax, RSP(%rsp)
> + je 1f
> + subq $8, %rax
> + cmpq %rax, RSP(%rsp)
> + jne restore_args /* probably IST stack, can't optimize */
> +#endif
> + /* there is no padding above iret frame */
> + movq EFLAGS(%rsp), %rax
> + movq RIP(%rsp), %rcx
> + movq %rax, (SIZEOF_PTREGS-2*8)(%rsp)
> + movq %rcx, (SIZEOF_PTREGS-1*8)(%rsp)
> + CFI_REMEMBER_STATE
> + RESTORE_C_REGS
> + REMOVE_PT_GPREGS_FROM_STACK 4*8 /* remove all except last two words */
> + popfq_cfi
> + retq

BTW, have you made sure that this path has been hit?

-- Steve

> + CFI_RESTORE_STATE
> +1: /* there are 8 bytes of padding above iret frame */
> + movq EFLAGS(%rsp), %rax
> + movq RIP(%rsp), %rcx
> + movq %rax, (SIZEOF_PTREGS-2*8 + 8)(%rsp)
> + movq %rcx, (SIZEOF_PTREGS-1*8 + 8)(%rsp)
> + CFI_REMEMBER_STATE
> + RESTORE_C_REGS
> + REMOVE_PT_GPREGS_FROM_STACK 4*8 + 8
> + popfq_cfi
> + retq
> + CFI_RESTORE_STATE
> +
> restore_args:
> RESTORE_C_REGS
> REMOVE_PT_GPREGS_FROM_STACK 8
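
For reference, the three-way check in the #else branch and the slot choice
for popfq/retq can be modelled in user-space C roughly as below. This is a
sketch under assumptions: the enum and function names are hypothetical,
"regs" stands for the value of %rsp at this point, and sizeof_ptregs is
passed in as a parameter rather than taken from the kernel headers.

```c
#include <assert.h>
#include <stdint.h>

enum frame_kind {
	FRAME_NO_PADDING,	/* saved RSP is right above pt_regs */
	FRAME_PADDED_8,		/* 8 bytes of padding above the iret frame */
	FRAME_OTHER_STACK	/* probably an IST stack, use plain iret */
};

/* Mirrors the leaq/cmpq/je/subq/cmpq/jne sequence in the patch. */
static enum frame_kind classify_frame(uint64_t regs, uint64_t saved_rsp,
				      uint64_t sizeof_ptregs)
{
	uint64_t end = regs + sizeof_ptregs;

	if (saved_rsp == end + 8)
		return FRAME_PADDED_8;
	if (saved_rsp == end)
		return FRAME_NO_PADDING;
	return FRAME_OTHER_STACK;
}

/* Offset from regs where EFLAGS is stored; RIP goes 8 bytes above it.
 * Only meaningful for the two optimizable frame kinds. */
static uint64_t eflags_slot(uint64_t sizeof_ptregs, enum frame_kind kind)
{
	return sizeof_ptregs - 2 * 8 + (kind == FRAME_PADDED_8 ? 8 : 0);
}
```

In both optimizable cases the EFLAGS slot plus 16 bytes (one pop for
popfq, one for retq) equals the distance from regs to the saved RSP, so
%rsp ends up where the interrupted code expects it.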
