Re: [PATCH 2/2] x86/asm/entry/32: Remove unnecessary optimization in stub32_clone

From: Josh Triplett
Date: Wed Apr 22 2015 - 13:10:29 EST


On Wed, Apr 22, 2015 at 09:54:24AM -0700, Andy Lutomirski wrote:
> On Wed, Apr 22, 2015 at 9:40 AM, Denys Vlasenko <dvlasenk@xxxxxxxxxx> wrote:
> > Really swap arguments #4 and #5 in stub32_clone instead of "optimizing"
> > it into a move.
> >
> > Yes, tls_val is currently unused. Yes, on some CPUs XCHG is a little bit
> > more expensive than MOV. But a cycle or two on an expensive syscall like
> > clone() is way below noise floor, and obfuscation of logic introduced
> > by this optimization is simply not worth it.
>
> Ditto re: Josh's patch.

I do think my two-patch HAVE_COPY_THREAD_TLS series should go in fixing
this, but I'd like to see the final version of Denys' comment added on
top of it (with an update to the type and name of the tls argument to
match the changes to sys_clone).

Denys, would you consider submitting a patch adding your comment on top
of the two-patch series I just sent?

Thanks,
Josh Triplett

> --Andy
>
> >
> > Signed-off-by: Denys Vlasenko <dvlasenk@xxxxxxxxxx>
> > CC: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> > CC: Steven Rostedt <rostedt@xxxxxxxxxxx>
> > CC: Ingo Molnar <mingo@xxxxxxxxxx>
> > CC: Borislav Petkov <bp@xxxxxxxxx>
> > CC: "H. Peter Anvin" <hpa@xxxxxxxxx>
> > CC: Andy Lutomirski <luto@xxxxxxxxxxxxxx>
> > CC: Oleg Nesterov <oleg@xxxxxxxxxx>
> > CC: Frederic Weisbecker <fweisbec@xxxxxxxxx>
> > CC: Alexei Starovoitov <ast@xxxxxxxxxxxx>
> > CC: Will Drewry <wad@xxxxxxxxxxxx>
> > CC: Kees Cook <keescook@xxxxxxxxxxxx>
> > CC: x86@xxxxxxxxxx
> > CC: linux-kernel@xxxxxxxxxxxxxxx
> > ---
> > arch/x86/ia32/ia32entry.S | 6 ++----
> > 1 file changed, 2 insertions(+), 4 deletions(-)
> >
> > diff --git a/arch/x86/ia32/ia32entry.S b/arch/x86/ia32/ia32entry.S
> > index 8e72256..0c302d0 100644
> > --- a/arch/x86/ia32/ia32entry.S
> > +++ b/arch/x86/ia32/ia32entry.S
> > @@ -567,11 +567,9 @@ GLOBAL(stub32_clone)
> > * 32-bit clone API is clone(..., int tls_val, int *child_tidptr).
> > * 64-bit clone API is clone(..., int *child_tidptr, int tls_val).
> > * Native 64-bit kernel's sys_clone() implements the latter.
> > - * We need to swap args here. But since tls_val is in fact ignored
> > - * by sys_clone(), we can get away with an assignment
> > - * (arg4 = arg5) instead of a full swap:
> > + * We need to swap args here:
> > */
> > - mov %r8, %rcx
> > + xchg %r8, %rcx
> > jmp ia32_ptregs_common
> >
> > ALIGN
> > --
> > 1.8.1.4
> >
>
>
>
> --
> Andy Lutomirski
> AMA Capital Management, LLC
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/