RE: [PATCH 2/8] lib/lzo: clean-up by introducing COPY16

From: David Laight
Date: Fri Nov 30 2018 - 07:27:55 EST


From: Dave Rodgman
> Sent: 30 November 2018 11:48
> From: Matt Sealey <matt.sealey@xxxxxxx>
>
> Most compilers should be able to merge adjacent loads/stores whose sizes
> are each less than, but together add up to a multiple of, the machine
> word size (in effect a memcpy() of a constant amount). However, the
> macro only performs the copy; the pointer increment is in the calling
> code, hence we see
>
> *a = *b
> a += 8
> b += 8
> *a = *b
> a += 8
> b += 8
>
> This introduces a dependency between the two groups of statements which
> seems to defeat the compiler's optimizer and generates some very strange
> sequences of addition and subtraction of address offsets (i.e. the
> result is overcomplicated).
>
> Since COPY8 is only ever used in pairs to copy 16 bytes, just define
> COPY16 as COPY8,COPY8. We keep the COPY8 definition so that the copy
> is still done as unaligned accesses to machine-sized words, per the
> original code's intent; we just don't use COPY8 directly in the code
> proper.
>
> COPY16 then gives us code like:
>
> *a = *b
> *(a+8) = *(b+8)
> a += 16
> b += 16

You probably actually want:
t1 = *b;
t2 = *(b+8);
*a = t1;
*(a+8) = t2;
a += 16;
b += 16;
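
A self-contained sketch of that ordering in C, for illustration only. The
helper names load64(), store64() and copy16_loads_first() are hypothetical,
and doing the unaligned accesses through memcpy() is an assumption, not what
lib/lzo actually uses. A likely benefit (not spelled out above): with both
loads issued before either store, the compiler does not have to prove that
a and b cannot overlap before it pairs the two loads.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: unaligned 8-byte load via memcpy(). */
static inline uint64_t load64(const unsigned char *p)
{
	uint64_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}

/* Hypothetical helper: unaligned 8-byte store via memcpy(). */
static inline void store64(unsigned char *p, uint64_t v)
{
	memcpy(p, &v, sizeof(v));
}

/*
 * Copy 16 bytes from *b to *a and advance both pointers by 16.
 * Both loads happen before either store, and the pointers are
 * updated once, after the copy.
 */
static inline void copy16_loads_first(unsigned char **a,
				      const unsigned char **b)
{
	uint64_t t1 = load64(*b);
	uint64_t t2 = load64(*b + 8);

	store64(*a, t1);
	store64(*a + 8, t2);
	*a += 16;
	*b += 16;
}
```

Note that if the source and destination overlap within those 16 bytes, this
ordering can give a different result from the store-between-loads version,
so it is only a drop-in replacement where such overlap cannot occur.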

David
