Re: [RFC PATCH] x86: prevent gcc from emitting rep movsq/stosq for inlined ops

From: Andrew Cooper
Date: Wed Apr 02 2025 - 14:40:17 EST


On 02/04/2025 7:29 pm, Linus Torvalds wrote:
> On Wed, 2 Apr 2025 at 11:17, Andrew Cooper <andrew.cooper3@xxxxxxxxxx> wrote:
>> Taking a leaf out of the retpoline book, the ideal library call(s) would be:
>>
>> CALL __x86_thunk_rep_{mov,stos}sb
>>
>> using the REP ABI (parameters in %rcx/%rdi/etc), rather than the SYSV ABI.
> Yes. That's basically what 'rep_movs_alternative' does so that we can
> basically do a
>
>         ALTERNATIVE("rep movsb",
>                     "call rep_movs_alternative",
>                     ALT_NOT(X86_FEATURE_FSRM))
>
> but we only do this for user space copies exactly because we don't
> have a good way to do it for compiler-generated ones.
>
> If gcc just did an out-of-line call, but used the 'rep movs' "calling
> convention", we would be able to basically do the rewriting
> dynamically, replacing the call with an inlined "rep movsb" where
> appropriate.

You still want the compiler to be able to do a first-pass optimisation
over __builtin_mem*(), for elimination/merging/etc, but if it could stop
halfway through what it currently does and just emit the library call,
that would be excellent.
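
To make the "REP ABI" concrete, here is a rough userspace sketch of a copy
that takes its operands in the registers 'rep movsb' already uses
(%rdi = destination, %rsi = source, %rcx = byte count). The name
rep_abi_memcpy is hypothetical and exists only for illustration; it is
neither the proposed thunk nor an existing kernel helper, and the memcpy()
fallback is only there so the sketch builds on non-x86 targets.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not a kernel or gcc API): perform a copy using
 * the register convention of "rep movsb". Because the operands already
 * sit in %rdi/%rsi/%rcx, a call site using this convention could in
 * principle be patched between "call <thunk>" and an inline "rep movsb"
 * without any register shuffling.
 */
static inline void *rep_abi_memcpy(void *dst, const void *src, size_t len)
{
	void *ret = dst;

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
	/*
	 * "+D", "+S" and "+c" pin the operands to %rdi, %rsi and %rcx and
	 * mark them as modified, since the instruction advances both
	 * pointers and counts %rcx down to zero.
	 */
	__asm__ volatile("rep movsb"
			 : "+D" (dst), "+S" (src), "+c" (len)
			 :
			 : "memory");
#else
	/* Portable fallback so the sketch still compiles elsewhere. */
	memcpy(dst, src, len);
#endif
	return ret;
}
```

The point of the convention is that nothing other than %rdi/%rsi/%rcx is
clobbered, which is what would let the call be rewritten in place.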

~Andrew