Re: x86 copy performance regression

From: Eric Dumazet
Date: Fri May 26 2023 - 13:25:44 EST


On Fri, May 26, 2023 at 7:17 PM Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Fri, May 26, 2023 at 10:00 AM Linus Torvalds
> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > Let me go look at it some more. I *really* didn't want to make the
> > code worse for ERMS
>
> Oh well. I'll think about it some more in the hope that I can come up
> with something clever that doesn't make objtool hate me, but in the
> meantime let me just give you the "not clever" patch.
>
> It generates an annoying six-byte jump when the small 2-byte one would
> work just fine, but I guess only my pride is wounded.

arch/x86/lib/copy_user_64.S:34:2: error: invalid instruction mnemonic 'alternative'
alternative "jae .Lunrolled", "jae .Llarge", ( 9*32+ 9)
^~~~~~~~~~~

I changed 'alternative' to 'ALTERNATIVE' to get it to build:

 SYM_FUNC_START(rep_movs_alternative)
 	cmpq $64,%rcx
-	jae .Lunrolled
+	ALTERNATIVE "jae .Lunrolled", "jae .Llarge", X86_FEATURE_ERMS
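
For anyone following along, here is a rough C model of which path that
branch selects once alternatives have been patched. It is only a sketch:
pick_copy_path(), enum copy_path and the PATH_* names are made up for
illustration, and it assumes .Llarge is the "rep movsb" path from the
patch, which is not shown in this mail.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical labels for the three paths; not kernel identifiers. */
enum copy_path {
	PATH_SMALL,	/* fall through: count below 64 bytes */
	PATH_UNROLLED,	/* .Lunrolled: taken when ERMS is not available */
	PATH_LARGE	/* .Llarge: assumed to be the "rep movsb" path */
};

/*
 * Models "cmpq $64,%rcx" followed by the patched "jae":
 * jae is an unsigned >= test, so exactly 64 bytes already branches.
 */
static enum copy_path pick_copy_path(size_t len, bool has_erms)
{
	if (len < 64)
		return PATH_SMALL;

	/* ALTERNATIVE swaps the jump target when X86_FEATURE_ERMS is set. */
	return has_erms ? PATH_LARGE : PATH_UNROLLED;
}
```

So with ERMS a 64-byte copy goes to .Llarge, without ERMS it still goes
to .Lunrolled, and anything under 64 bytes is unaffected by the change.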

I will report test results soon, thanks!