Re: [RFC] Improve memset

From: Alexey Dobriyan
Date: Sat Sep 14 2019 - 05:29:23 EST


> Instead of calling memset:
>
> ffffffff8100cd8d: e8 0e 15 7a 00 callq ffffffff817ae2a0 <__memset>
>
> and having a JMP inside it depending on the feature supported, let's simply
> have the REP; STOSB directly in the code:
>
> ...
> ffffffff81000442: 4c 89 d7 mov %r10,%rdi
> ffffffff81000445: b9 00 10 00 00 mov $0x1000,%ecx
>
> <---- new memset
> ffffffff8100044a: f3 aa rep stos %al,%es:(%rdi)
> ffffffff8100044c: 90 nop
> ffffffff8100044d: 90 nop
> ffffffff8100044e: 90 nop

You can fit the entire "xor eax, eax; rep stosb" sequence (2 + 2 bytes) inside
the 5 bytes of the call instruction.
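
Roughly what I mean, as an untested sketch. memset0() here is only an
illustrative name and signature, not the actual patch. It assumes GCC-style
inline asm on x86-64 with the direction flag clear (as the ABI guarantees), and
the #else branch is just a fallback so the snippet compiles elsewhere:

```c
#include <stddef.h>

/*
 * Hypothetical memset0(): zero n bytes starting at p.
 * On x86-64 this is "xor eax, eax; rep stosb", i.e. 31 c0 f3 aa,
 * which is 4 bytes and so no larger than a 5-byte "call rel32".
 */
static inline void memset0(void *p, size_t n)
{
#if defined(__x86_64__)
	__asm__ volatile(
		"xorl %%eax, %%eax\n\t"	/* AL = 0, the fill byte */
		"rep stosb"		/* store AL to (%rdi), RCX times */
		: "+D"(p), "+c"(n)	/* RDI and RCX are read and modified */
		:
		: "eax", "cc", "memory");
#else
	/* Portable fallback, only so the sketch builds on other arches. */
	volatile unsigned char *q = p;

	while (n--)
		*q++ = 0;
#endif
}
```

Note that the only registers touched are rdi, rcx and eax, so the clobber
list stays short.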

> /* clobbers used by memset_orig() and memset_rep_good() */
> : "rsi", "rdx", "r8", "r9", "memory");

eh... I'd just drop it. These registers screw up everything.

Time to rebase memset0().