Re: Proposal for finishing the 64-bit x86 syscall cleanup

From: Ingo Molnar
Date: Tue Aug 25 2015 - 04:42:45 EST

* Ingo Molnar <mingo@xxxxxxxxxx> wrote:

> * Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
> > Hi all-
> >
> > I want to (try to) mostly or fully get rid of the messy bits (as
> > opposed to the hardware-bs-forced bits) of the 64-bit syscall asm.
> > There are two major conceptual things that are in the way.
> >
> > Thing 1: partial pt_regs
> >
> > 64-bit fast path syscalls don't fully initialize pt_regs: bx, bp, and
> > r12-r15 are uninitialized. Some syscalls require them to be
> > initialized, and they have special awful stubs to do it. The entry
> > and exit tracing code (except for phase1 tracing) also need them
> > initialized, and they have their own messy initialization. Compat
> > syscalls are their own private little mess here.
> >
> > This gets in the way of all kinds of cleanups, because C code can't
> > switch between the full and partial pt_regs states.
> >
> > I can see two ways out. We could remove the optimization entirely,
> > which consists of pushing and popping six more registers and adds
> > about ten cycles to fast path syscalls on Sandy Bridge. It also
> > simplifies and presumably speeds up the slow paths.
> So out of hundreds of regular system calls there's only a handful of such system
> calls:
> triton:~/tip> git grep stub arch/x86/entry/syscalls/
> arch/x86/entry/syscalls/syscall_32.tbl:2 i386 fork sys_fork stub32_fork
> arch/x86/entry/syscalls/syscall_32.tbl:11 i386 execve sys_execve stub32_execve
> arch/x86/entry/syscalls/syscall_32.tbl:119 i386 sigreturn sys_sigreturn stub32_sigreturn
> arch/x86/entry/syscalls/syscall_32.tbl:120 i386 clone sys_clone stub32_clone
> arch/x86/entry/syscalls/syscall_32.tbl:173 i386 rt_sigreturn sys_rt_sigreturn stub32_rt_sigreturn
> arch/x86/entry/syscalls/syscall_32.tbl:190 i386 vfork sys_vfork stub32_vfork
> arch/x86/entry/syscalls/syscall_32.tbl:358 i386 execveat sys_execveat stub32_execveat
> arch/x86/entry/syscalls/syscall_64.tbl:15 64 rt_sigreturn stub_rt_sigreturn
> arch/x86/entry/syscalls/syscall_64.tbl:56 common clone stub_clone
> arch/x86/entry/syscalls/syscall_64.tbl:57 common fork stub_fork
> arch/x86/entry/syscalls/syscall_64.tbl:58 common vfork stub_vfork
> arch/x86/entry/syscalls/syscall_64.tbl:59 64 execve stub_execve
> arch/x86/entry/syscalls/syscall_64.tbl:322 64 execveat stub_execveat
> arch/x86/entry/syscalls/syscall_64.tbl:513 x32 rt_sigreturn stub_x32_rt_sigreturn
> arch/x86/entry/syscalls/syscall_64.tbl:520 x32 execve stub_x32_execve
> arch/x86/entry/syscalls/syscall_64.tbl:545 x32 execveat stub_x32_execveat
> and none of them are super performance critical system calls, so no way would I
> go for unconditionally saving/restoring all of ptregs, just to make it a bit
> simpler for these syscalls.

Let me qualify that: no way in the long run.

In the short run we can drop the optimization and reintroduce it later, to lower
all the risks that the C conversion brings with itself.

( That would also make it easier to re-analyze the cost/benefit ratio of the
optimization. )

So feel free to introduce a simple ptregs save/restore pattern for now.


To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at
Please read the FAQ at