Re: [PATCH 3/4] x86: Rewrite switch_to() code

From: Andy Lutomirski
Date: Sun May 22 2016 - 14:00:19 EST


cc: Josh Poimboeuf: do you care about the exact stack layout of the
bottom of the stack of an inactive task?

On May 21, 2016 9:05 AM, "Brian Gerst" <brgerst@xxxxxxxxx> wrote:
>
> Move the low-level context switch code to an out-of-line asm stub instead of
> using complex inline asm. This allows constructing a new stack frame for the
> child process to make it seamlessly flow to ret_from_fork without an extra
> test and branch in __switch_to(). It also improves code generation for
> __schedule() by using the C calling convention instead of clobbering all
> registers.

I like the concept a lot.

>
> Signed-off-by: Brian Gerst <brgerst@xxxxxxxxx>
> ---
> arch/x86/entry/entry_32.S | 38 ++++++++++
> arch/x86/entry/entry_64.S | 42 +++++++++++-
> arch/x86/include/asm/processor.h | 3 -
> arch/x86/include/asm/switch_to.h | 137 ++++++-------------------------------
> arch/x86/include/asm/thread_info.h | 2 -
> arch/x86/kernel/asm-offsets.c | 6 ++
> arch/x86/kernel/asm-offsets_32.c | 5 ++
> arch/x86/kernel/asm-offsets_64.c | 5 ++
> arch/x86/kernel/process_32.c | 8 ++-
> arch/x86/kernel/process_64.c | 7 +-
> arch/x86/kernel/smpboot.c | 1 -
> 11 files changed, 124 insertions(+), 130 deletions(-)
>
> diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
> index ee6fea0..05e5340 100644
> --- a/arch/x86/entry/entry_32.S
> +++ b/arch/x86/entry/entry_32.S
> @@ -204,6 +204,44 @@
> POP_GS_EX
> .endm
>
> +/*
> + * %eax: prev task
> + * %edx: next task
> + */
> +ENTRY(__switch_to_asm)
> + /*
> + * Save callee-saved registers
> + * This must match the order in struct fork_frame
> + * Frame pointer must be last for get_wchan
> + */
> + pushl %ebx
> + pushl %edi
> + pushl %esi
> + pushl %ebp
> +
> + /* switch stack */
> + movl %esp, TASK_threadsp(%eax)
> + movl TASK_threadsp(%edx), %esp
> +
> +#ifdef CONFIG_CC_STACKPROTECTOR
> + movl TASK_stack_canary(%edx), %ebx
> + movl %ebx, PER_CPU_VAR(stack_canary)+stack_canary_offset
> +#endif
> +
> + /* restore callee-saved registers */
> + popl %ebp
> + popl %esi
> + popl %edi
> + popl %ebx

This is highly, highly magical. eax and edx are prev and next, and:

> +
> + jmp __switch_to

leaves prev in eax. This works, but it might be worth a comment.

> +END(__switch_to_asm)

> /*
> + * %rdi: prev task
> + * %rsi: next task
> + */
> +ENTRY(__switch_to_asm)
> + /*
> + * Save callee-saved registers
> + * This must match the order in struct fork_frame
> + * Frame pointer must be last for get_wchan
> + */
> + pushq %rbx
> + pushq %r12
> + pushq %r13
> + pushq %r14
> + pushq %r15
> + pushq %rbp
> +
> + /* switch stack */
> + movq %rsp, TASK_threadsp(%rdi)
> + movq TASK_threadsp(%rsi), %rsp
> +
> +#ifdef CONFIG_CC_STACKPROTECTOR
> + movq TASK_stack_canary(%rsi), %rbx
> + movq %rbx, PER_CPU_VAR(irq_stack_union)+stack_canary_offset
> +#endif
> +
> + /* restore callee-saved registers */
> + popq %rbp
> + popq %r15
> + popq %r14
> + popq %r13
> + popq %r12
> + popq %rbx
> +
> + jmp __switch_to

Ditto with the magic here.

> +struct fork_frame {
> + unsigned long bp;
> +#ifdef CONFIG_X86_64
> + unsigned long r15;
> + unsigned long r14;
> + unsigned long r13;
> + unsigned long r12;
> +#else
> + unsigned long si;
> + unsigned long di;
> +#endif
> + unsigned long bx;
> + unsigned long ret_addr;
> + struct pt_regs regs;
> +};

This, like the old implementation, is very much geared to the current
implementation of fork. Can you split it up:

struct inactive_task_frame {
unsigned long bp;
...
unsigned long ret_addr;
};

/* fork works by setting up the child stack so that switch_to will
land at ret_from_fork with sp pointing at pt_regs */
struct fork_frame {
struct inactive_task_frame switch_frame;
struct pt_regs regs;
};

Then, if and when someone wants to fork into a different type of
context, they can reuse this. Also, a future improved unwinder can
use inactive_task_frame directly to kick off its unwind.

--Andy