Re: [discuss] [PATCH] x86-64: memset optimization

From: Jan Hubicka
Date: Mon Aug 20 2007 - 14:56:49 EST


> > > The problem is with the optimization flags: passing -Os causes the compiler
> > > to be stupid and not inline any memset/memcpy functions.
> >
> > you get what you ask for.. if you don't want that then don't ask for
> > it ;)
>
> Well, the compiler is really being dumb about -Os and in fact it's
> giving bigger code, so I'm not really getting what I ask for.
>
> With my gcc at least (x86_64, gcc (GCC) 4.1.3 20070812 (prerelease)
> (Ubuntu 4.1.2-15ubuntu2)) and Andi's example:
>
> #include <string.h>
>
> f(char x[6]) {
> memset(x, 1, 6);
> }
>
> compiling with -O2 gives
>
> 0000000000000000 <f>:
>    0:   c7 07 01 01 01 01       movl   $0x1010101,(%rdi)
>    6:   66 c7 47 04 01 01       movw   $0x101,0x4(%rdi)
>    c:   c3                      retq

GCC mainline (i.e. the future GCC 4.3.0) now gives:
0000000000000000 <f>:
   0:   b0 01                   mov    $0x1,%al
   2:   b9 06 00 00 00          mov    $0x6,%ecx
   7:   f3 aa                   rep stos %al,%es:(%rdi)
   9:   c3                      retq
That is the smallest, but definitely not the fastest.
GCC up to 4.3.0 won't be able to inline memset with a non-zero operand...
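
For readers comparing the listings in this thread, here is a rough C sketch
of the three strategies they show: two wide stores (the -O2 output), a
byte-at-a-time fill (approximately what rep stos does), and an out-of-line
memset call (the GCC 4.1 -Os output). The function names are hypothetical
and chosen only for illustration, and the mapping from C to the
instructions is approximate; an optimizing compiler is free to emit
something else for each of them.

```c
#include <stdint.h>
#include <string.h>

/* Approximates the -O2 listing: one 4-byte store, then one 2-byte store. */
static void fill6_wide_stores(char *x)
{
    const uint32_t four = 0x01010101u;
    const uint16_t two = 0x0101u;

    memcpy(x, &four, sizeof four);     /* movl $0x1010101,(%rdi) */
    memcpy(x + 4, &two, sizeof two);   /* movw $0x101,0x4(%rdi)  */
}

/* Approximates rep stos: store the byte once per remaining count. */
static void fill6_byte_loop(char *x)
{
    for (size_t n = 6; n != 0; n--)
        *x++ = 1;
}

/* The out-of-line variant: just call the library memset. */
static void fill6_libcall(char *x)
{
    memset(x, 1, 6);
}

/* Returns 1 if all three variants leave the same six bytes behind. */
int fill6_variants_agree(void)
{
    char a[6] = {0}, b[6] = {0}, c[6] = {0};

    fill6_wide_stores(a);
    fill6_byte_loop(b);
    fill6_libcall(c);

    return memcmp(a, b, sizeof a) == 0 && memcmp(b, c, sizeof b) == 0;
}
```

Calling fill6_variants_agree() from a small main should return 1, since
the variants differ only in code size and speed, not in the bytes they
write.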

Honza
>
> and compiling with -Os gives
>
> 0000000000000000 <f>:
>    0:   48 83 ec 08             sub    $0x8,%rsp
>    4:   ba 06 00 00 00          mov    $0x6,%edx
>    9:   be 01 00 00 00          mov    $0x1,%esi
>    e:   e8 00 00 00 00          callq  13 <f+0x13>
>   13:   5a                      pop    %rdx
>   14:   c3                      retq
>
> so the code gets bigger and worse in every way.
>
> - R.