Re: -Os versus -O2

From: Segher Boessenkool
Date: Mon Jun 25 2007 - 04:42:00 EST


In my experience, -Os produced faster code on gcc-2.95 than -O2 or -O3.

On what CPU? The effect of different optimisations varies
hugely between different CPUs (and architectures).

x86

That's not a CPU, that's an architecture. I hope you
understand there are very big differences between different
members of the x86 family and you don't compare 2.95 on a
Pentium class CPU to 3.x on an Opteron or 4.x on a Pentium4
or something like that.

With gcc-3.3, -Os shows roughly the same performance as -O2 for me on
various programs. However, with gcc-3.4, I noticed a slowdown with
-Os. And with gcc-4, using -Os optimizes only for size, even if the
output code is slow as hell. I've had programs whose speed dropped
by 70% using -Os on gcc-4.

Well, you'd better report those! <http://gcc.gnu.org/bugzilla>

No, -Os is for size only:

-Os Optimize for size. -Os enables all -O2 optimizations
that do not typically increase code size. It also
performs further optimizations designed to reduce code
size.

That is not "for size only". Please read again.

A 70% speed decrease is something that should be at least
investigated, even if then perhaps it is decided GCC already
does the "right thing".

So it is expected that speed can be reduced using -Os. I won't report
a thing which is already documented!

A few percentage points slower is expected, 20% would be
explainable, but 70%?

-O2 and -Os are supposed to differ in _minor_ ways. Such
a huge performance drop is unexpected. If you file the PR,
feel free to blame me for reporting it at all.
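
To put the figures in this exchange on a common scale, here is a minimal sketch that converts a fractional drop in speed into a run-time factor. It assumes "speed" means throughput (work per unit time); the thread does not say how the 70% was measured, and the function name runtime_factor is made up for illustration.

```python
def runtime_factor(speed_drop):
    """Convert a fractional drop in speed (throughput) into the factor
    by which run time grows. Assumes speed_drop is in [0, 1)."""
    if not 0.0 <= speed_drop < 1.0:
        raise ValueError("speed_drop must be in [0, 1)")
    return 1.0 / (1.0 - speed_drop)


# The three magnitudes mentioned in the thread: a few percent, 20%, 70%.
for drop in (0.03, 0.20, 0.70):
    print(f"{drop:.0%} slower -> run time x{runtime_factor(drop):.2f}")
```

Under that assumption, a 20% drop stretches run time by a quarter, while a 70% drop more than triples it, which is why the latter looks like something to investigate rather than an expected cost of -Os.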

But in some situations, it's desirable to have the smallest possible
kernel whatever its performance. This goes for installation CDs for
instance.

There are much better ways to achieve that.

Optimizing is not a matter of choosing *one* way, but of combining
everything you have.

Yes of course. I'm just saying -Os is a pretty minor step
in the overall making-things-smaller game. Leaving out XFS
helps a whole megabyte on my default target, for example.

For instance, on a smart boot loader, I have
a kernel which is about 300 kB, or 700 kB with the initramfs. Among
the tricks I used:
- -Os
- -march=i386
- align everything to 0
- replace gzip with p7zip

Even if each of them reduces overall size by only 5%, the net result is
0.95^4 = 0.81 of the original size, i.e. a gain of about 19% for the
same set of features. This is something to consider.
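
The compounding arithmetic above can be checked with a short sketch. It assumes each trick shrinks the image by an independent fraction of whatever size is left; the 5% per trick is the illustrative figure from the message, not a measurement, and the function name combined_size_ratio is made up for this example.

```python
def combined_size_ratio(reductions):
    """Return the fraction of the original size that remains after
    applying each fractional reduction in turn (0.05 means 5% smaller)."""
    ratio = 1.0
    for r in reductions:
        ratio *= 1.0 - r
    return ratio


# Four tricks (-Os, -march=i386, alignment, compressor), 5% each (assumed).
ratio = combined_size_ratio([0.05] * 4)
print(f"remaining size: {ratio:.4f}, total gain: {1 - ratio:.1%}")
# prints "remaining size: 0.8145, total gain: 18.5%"
```

The exact product is 0.8145, so the gain is nearer 18.5% than 19%. Either way the reductions multiply rather than add, so four 5% savings give a little less than 20%.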

Sure. I don't think making -Os mean "as small as possible
in all cases" (or, rather, introducing a new option for that)
would help terribly much over the current -Os meaning -- a
few percent at most. That's not to say that no such optimisations
are added anymore, but mostly they turn out not to decrease
speed at all and so are enabled at any -O level :-)


Segher
