RE: [PATCH v4] x86, mem: move memmove to out of line assembler
From: David Laight
Date: Fri Sep 30 2022 - 06:14:39 EST
From: Nick Desaulniers
> Sent: 28 September 2022 22:05
...
Reading it again, what is this test supposed to achieve?
> + /*
> + * movs instruction is only good for aligned case.
> + */
> + movl src, tmp0
> + xorl dest, tmp0
> + andl $0xff, tmp0
> + jz .Lforward_movs
The 'aligned' test would be '(src | dest) & 3'.
(Or maybe '& 7', since some 32-bit x86 CPUs actually
do 8-byte-aligned 'rep movsl' faster than 4-byte-aligned
ones.)
OTOH the code loop is likely to be slower still.
I've not tried measuring misaligned 'rep movsw', but
on some recent Intel CPUs normal misaligned reads cost
almost nothing, even when doing two reads/clock.
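
To make the difference between the two tests concrete, here is a minimal C
sketch. The function names are hypothetical and the pointers are modelled as
plain integers; it is not taken from the patch. The quoted assembly computes
'(src ^ dest) & 0xff', which only checks that the two addresses have the same
low 8 bits. It says nothing about whether either address is itself aligned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The test as written in the quoted patch: true when src and dest
 * share the same low 8 bits (same offset within a 256-byte block). */
static bool low_byte_matches(uintptr_t src, uintptr_t dest)
{
    return ((src ^ dest) & 0xff) == 0;
}

/* The suggested test: true when both src and dest are aligned to
 * 'align' bytes. 'align' must be a power of two (4 or 8 here). */
static bool both_aligned(uintptr_t src, uintptr_t dest, uintptr_t align)
{
    return ((src | dest) & (align - 1)) == 0;
}

/* Worked examples (hypothetical addresses). */
static void check_examples(void)
{
    /* Same low byte, but neither pointer is 4-byte aligned. */
    assert(low_byte_matches(0x1001, 0x2101));
    assert(!both_aligned(0x1001, 0x2101, 4));

    /* Both 4-byte aligned, but the low bytes differ. */
    assert(!low_byte_matches(0x1000, 0x2004));
    assert(both_aligned(0x1000, 0x2004, 4));

    /* 4-byte aligned does not imply 8-byte aligned. */
    assert(!both_aligned(0x1000, 0x2004, 8));
}
```

So the xor test sends 0x1001 -> 0x2101 down the 'rep movs' path even though
neither address is aligned, and rejects 0x1000 -> 0x2004 even though both are.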
David