Re: [PATCH] lib/lz4: make arrays static const, reduces object code size

From: Joe Perches
Date: Fri Sep 22 2017 - 21:33:55 EST


On Fri, 2017-09-22 at 21:17 +0200, Arnd Bergmann wrote:
> On Fri, Sep 22, 2017 at 7:21 PM, Joe Perches <joe@xxxxxxxxxxx> wrote:
> > On Fri, 2017-09-22 at 09:48 +0200, Arnd Bergmann wrote:
> > > On Fri, Sep 22, 2017 at 1:11 AM, Colin Ian King
> > >    text    data     bss     dec     hex filename
> > >   18220     176       0   18396    47dc build/tmp/lib/lz4/lz4_decompress-after.o
> > >   22297       0       0   22297    5719 build/tmp/lib/lz4/lz4_decompress-before.o
> >
> > Perhaps not so much a gcc bug as an opportunity
> > for gcc to add an additional optimization.
> >
> > gcc would have to verify that the const array is
> > not initialized with some variable or argument like:
> >
> > int foo(int a)
> > {
> > const int array[] = {1, a};
> > ...
> > }
>
> It depends. With a 10KB difference in .text size, my guess is that this
> is a case where gcc does the right optimization in principle, but
> fails to do what was intended in some corner cases.

Maybe/maybe not.
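
For reference, a minimal sketch of the three cases under discussion.
The function names and table values here are hypothetical and purely
illustrative, not copied from lib/lz4. The first two functions return
the same results; the difference is only in where the table lives.
The third is the case where the array cannot be made static at all.

```c
#include <stddef.h>

/* Hypothetical lookup tables; names and values are illustrative only. */

/*
 * Automatic storage: unless the compiler proves it can share one
 * read-only copy, it may emit code that builds the array on the
 * stack each time the function is called.
 */
int lookup_auto(size_t i)
{
	const int table[] = {0, 1, 2, 1, 4, 4, 4, 4};

	return table[i & 7];
}

/*
 * Static storage: a single copy is emitted as read-only data at
 * build time, so no per-call initialization code is needed.
 */
int lookup_static(size_t i)
{
	static const int table[] = {0, 1, 2, 1, 4, 4, 4, 4};

	return table[i & 7];
}

/*
 * The initializer depends on an argument, so this array has to be
 * built at run time and cannot be declared static.
 */
int lookup_runtime(int a, size_t i)
{
	const int table[] = {1, a};

	return table[i & 1];
}
```

Compiling something like this with -O2 and comparing the output of
size(1) for each variant is one way to see whether a given compiler
version treats the first two forms the same.
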
> I just cross-checked by building with clang, there the patch has
> no impact on code size, it is 24929 bytes with or without the patch.
>
> Looking at other versions of (x86) gcc, I see .text sizes of
>
>             after   before
> gcc-3.4.6   10855    12977
> gcc-4.0.4   11088    11088
> gcc-4.1.3   10973    10973
> gcc-4.2.5   11183    11183
> gcc-4.3.6   15501    17724

Interesting, this was apparently deoptimized in version 4.3.

Glancing at the release notes doesn't seem to indicate
anything obvious.

https://gcc.gnu.org/gcc-4.3/changes.html

> gcc-4.4.7   13337    15693
> gcc-4.5.4   13162    15491
> gcc-4.6.4   14846    17302
> gcc-4.7.4   14187    16294
> gcc-4.8.5   16591    18730
> gcc-4.9.4   19582    21995
> gcc-5.4.1   18294    22510
> gcc-6.1.1   20487    25172
> gcc-6.3.1   20487    25172
> gcc-7.0.0   20351    31789
> gcc-7.0.1   20351    24966
> gcc-7.1.1   20383    24982
> gcc-8.0.0   20686    25065
>
> It seems whatever happened in early versions of gcc-7 has since
> improved, and it probably was a bug since older and newer versions
> create similar code size (I have not looked at the actual object code).
>
> The 5K difference in gcc-5 and higher still seems like a lot. It would
> also be interesting to look at the decompression performance of
> this code with the different compilers to see if it got better or worse.

yup

> Most likely, gcc got better at inlining and unrolling parts of the
> algorithm, but sometimes an object file that doubles or triples in
> size is an indication that the compiler did something really bad.

yup[2]

cheers, Joe