Re: new execve/kernel_thread design

From: Al Viro
Date: Fri Oct 19 2012 - 13:30:24 EST

On Fri, Oct 19, 2012 at 05:16:50PM +0000, Luck, Tony wrote:
> > Surprisingly enough, ia64 one seems to work on actual hardware; I have sent
> > Tony an incremental patch cleaning copy_thread() up, waiting for results of
> > testing that on SMP box.
> Tiny bit faster than plain 3.7-rc1. lmbench3 reports fork+execve test at between
> 558 to 567 usec with the new code, compared with 562-572 usec with the old.

Are you OK with the state of comments in call_payload() in the current
form of that sucker? Right now in #arch-ia64 it looks like this:
/* call the kernel_thread payload; fn is in r4, arg - in r5 */
alloc loc1=ar.pfs,0,3,1,0
mov loc0=rp
mov loc2=gp
mov out0=r5 // arg
ld8 r14 = [r4], 8 // fn.address
mov b6 = r14
ld8 gp = [r4] // fn.gp
;;
br.call.sptk.many rp=b6 // fn(arg)
.ret12: mov gp=loc2
mov rp=loc0
mov ar.pfs=loc1
/* ... and if it has returned, we are going to userland */
cmp.ne pKStk,pUStk=r0,r0
br.ret.sptk.many rp

IIRC, the lack of comments on a function with unusual calling conventions was
the last remaining issue...
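
For anybody not used to ia64 function descriptors, here is a rough C model of
what the listing above does. It is only a sketch under stated assumptions, not
the kernel code: struct fdesc_model, model_gp, example_payload() and
call_payload_model() are hypothetical names made up for illustration, and the
gp register is faked with a plain variable. The one thing taken from the
listing is the order of operations: save gp, load the callee's gp from the
second word of the descriptor, call fn(arg), restore gp, and report that the
thread is now headed for userland.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a function descriptor: entry point plus gp. */
struct fdesc_model {
    int (*address)(void *arg);  /* first word:  fn.address */
    unsigned long gp;           /* second word: fn.gp */
};

/* Stands in for the gp register; purely illustrative. */
static unsigned long model_gp = 0x1000;

/* Records the gp value the payload observed while it was running. */
static unsigned long gp_seen_by_payload;

/* Example payload: notes the current gp and returns. */
static int example_payload(void *arg)
{
    (void)arg;
    gp_seen_by_payload = model_gp;
    return 0;
}

/*
 * Model of call_payload(): fn plays the role of r4, arg the role of r5.
 * Returns true to represent "pUStk set, pKStk clear" once the payload
 * has returned, i.e. the thread is on its way to userland.
 */
static bool call_payload_model(const struct fdesc_model *fn, void *arg)
{
    unsigned long saved_gp = model_gp;  /* mov loc2=gp */

    model_gp = fn->gp;                  /* ld8 gp = [r4] */
    (void)fn->address(arg);             /* call via b6: fn(arg) */
    model_gp = saved_gp;                /* .ret12: mov gp=loc2 */
    return true;
}
```

Calling call_payload_model() with a descriptor whose gp differs from model_gp
shows the payload running with its own gp while the caller gets its original
value back afterwards.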
