Re: [RFC 09/23] x86_64: Store userspace rsp in system_call fastpath

From: Frederic Weisbecker
Date: Thu Jun 28 2012 - 08:08:50 EST


On Tue, Jun 19, 2012 at 05:48:00PM +0200, Jiri Olsa wrote:
> hi,
> I'd need help with this change.. basically it works, but I guess
> I could alter the FIXUP_TOP_OF_STACK macro as well, so the rsp is
> not initialized twice in slowpath.
>
> But it seems quite complex.. so not really sure at the moment ;)
>
> ideas?
>
> thanks,
> jirka
>
> ---
> Storing the userspace rsp into the pt_regs struct for the
> system_call fastpath (syscall instruction handler).
>
> Following part of the pt_regs is allocated on stack:
> (via KERNEL_STACK_OFFSET)
>
> unsigned long ip;
> unsigned long cs;
> unsigned long flags;
> unsigned long sp;
> unsigned long ss;
>
> but only ip is actually saved for fastpath.
>
> For perf post unwind we need at least ip and sp to be able to
> start the unwind, so storing the old_rsp value to the sp.
>
> Signed-off-by: Jiri Olsa <jolsa@xxxxxxxxxx>
> Cc: Borislav Petkov <bp@xxxxxxxxx>
> Cc: H. Peter Anvin <hpa@xxxxxxxxx>
> Cc: Roland McGrath <roland@xxxxxxxxxxxxx>
> ---
> arch/x86/kernel/entry_64.S | 5 +++++
> 1 files changed, 5 insertions(+), 0 deletions(-)
>
> diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
> index 111f6bb..0444917 100644
> --- a/arch/x86/kernel/entry_64.S
> +++ b/arch/x86/kernel/entry_64.S
> @@ -516,6 +516,11 @@ GLOBAL(system_call_after_swapgs)
> SAVE_ARGS 8,0
> movq %rax,ORIG_RAX-ARGOFFSET(%rsp)
> movq %rcx,RIP-ARGOFFSET(%rsp)
> +#ifdef CONFIG_PERF_EVENTS
> + /* We need rsp in fast path for perf post unwind. */
> + movq PER_CPU_VAR(old_rsp), %rcx
> + movq %rcx,RSP-ARGOFFSET(%rsp)
> +#endif

Another solution is to set/unset some TIF flag in perf_event_sched_in/out
so that we take the syscall slow path (tracesys), which records all non-scratch
registers. I can see that old_rsp is not saved in pt_regs by SAVE_REST, so this may be
something to add in tracesys.

This way we don't bloat the syscall fastpath with a feature only used by some
developers.
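
A minimal sketch of that idea, written as a userspace C model and not as
kernel code. Everything named *_model here is hypothetical: the flag bit,
the structs and the helpers only stand in for a TIF flag, pt_regs, old_rsp
and the perf sched in/out hooks. It also assumes the slow path is extended
to copy the saved user stack pointer into the register frame.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bit; not an existing kernel TIF_* name. */
#define TIF_PERF_UNWIND_MODEL        (1u << 0)
/* Stand-in for the mask tested at syscall entry (_TIF_WORK_SYSCALL_ENTRY). */
#define TIF_WORK_SYSCALL_ENTRY_MODEL (TIF_PERF_UNWIND_MODEL)

/* Only the two registers the unwinder needs to start from. */
struct regs_model {
	unsigned long ip;
	unsigned long sp;
	bool sp_valid;		/* true only if the slow path filled in sp */
};

struct task_model {
	unsigned int flags;	/* models the thread_info flags word */
	unsigned long user_sp;	/* plays the role of the per-cpu old_rsp */
	struct regs_model regs;
};

/* Models perf_event_sched_in: request the slow path for this task. */
static void model_perf_sched_in(struct task_model *t)
{
	assert(t != NULL);
	t->flags |= TIF_PERF_UNWIND_MODEL;
}

/* Models perf_event_sched_out: go back to the unmodified fast path. */
static void model_perf_sched_out(struct task_model *t)
{
	assert(t != NULL);
	t->flags &= ~TIF_PERF_UNWIND_MODEL;
}

/*
 * Models syscall entry. The fast path saves only ip. If any "work" flag
 * is set we branch to the slow path, which additionally records sp.
 * Returns true when the slow path was taken.
 */
static bool model_syscall_entry(struct task_model *t, unsigned long user_ip)
{
	assert(t != NULL);
	t->regs.ip = user_ip;
	t->regs.sp_valid = false;

	if (t->flags & TIF_WORK_SYSCALL_ENTRY_MODEL) {
		t->regs.sp = t->user_sp;	/* the assumed tracesys addition */
		t->regs.sp_valid = true;
		return true;
	}
	return false;
}
```

In this model a task with no perf event scheduled in pays nothing extra
at syscall entry: the only cost is the flag test that the entry code
already performs. Once model_perf_sched_in() has run, every entry takes
the slow path and sp becomes available for the unwind.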

> CFI_REL_OFFSET rip,RIP-ARGOFFSET
> testl $_TIF_WORK_SYSCALL_ENTRY,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
> jnz tracesys
> --
> 1.7.7.6
>