kernel: Current status of CONFIG_CC_OPTIMIZE_FOR_SIZE=y (was: Re: [PATCH -tip] x86/locking/atomic: Use asm_inline for atomic locking insns)
From: Ingo Molnar
Date: Thu Mar 06 2025 - 04:43:49 EST
* Uros Bizjak <ubizjak@xxxxxxxxx> wrote:
> On Wed, Mar 5, 2025 at 10:26 PM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> >
> > On Wed, Mar 05, 2025 at 09:36:33PM +0100, Borislav Petkov wrote:
> > > On Wed, Mar 05, 2025 at 09:54:11AM +0100, Uros Bizjak wrote:
> > > > The -Os argument was to show the effect of the patch when the compiler
> > > > is instructed to take care of the overall size. Giving the compiler
> > > > -O2 and then looking at the overall size of the produced binary is
> > > > just wrong.
> > >
> > > No one cares about -Os AFAICT. It might as well be non-existent. So the effect
> > > doesn't matter.
> >
> > Well, more people would care if it didn't stand for -Ostupid I suppose.
> > That is, traditionally GCC made some very questionable choices with -Os,
> > quite horrendous code-gen.
>
> Size optimizations result in 15% code size reduction (x86_64
> defconfig, gcc-14.2), so they reflect what user wanted:
>
>     text    data     bss      dec     hex filename
> 27478996 4635807  814660 32929463 1f676b7 vmlinux-O2.o
> 23859143 4617419  814724 29291286 1bef316 vmlinux-Os.o
>
> The compiler heuristics depend on tradeoffs, and -Os uses different
> tradeoffs than -O2. Unfortunately, there is no
> -Os-but-I-really-want-performance switch, but OTOH, tradeoffs can be
> adjusted. The compiler is open-source, and these adjustments can be
> discussed in public spaces (mailing lists and bugzilla) and eventually
> re-tuned. We are aware that the world around us changes, so tunings
> are not set in stone, but we also depend on user feedback.
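As a side note on the size figures quoted above: the percentage depends on which build is taken as the baseline. A small sketch of the arithmetic on the text column, where pct_of() and the two wrappers are hypothetical helpers written for this illustration and not part of any tool:

```c
/* text sizes in bytes, copied from the size(1) output quoted above */
#define TEXT_O2 27478996UL
#define TEXT_OS 23859143UL

/* Hypothetical helper: express `delta` as a percentage of `base`. */
static double pct_of(unsigned long delta, unsigned long base)
{
	return 100.0 * (double)delta / (double)base;
}

/* Text shrink of the -Os build, measured against the -O2 build. */
static double os_saving_vs_o2(void)
{
	return pct_of(TEXT_O2 - TEXT_OS, TEXT_O2);
}

/* Text growth of the -O2 build, measured against the -Os build. */
static double o2_growth_vs_os(void)
{
	return pct_of(TEXT_O2 - TEXT_OS, TEXT_OS);
}
```

Measured against the -O2 build the text shrinks by roughly 13%, while measured against the -Os build the -O2 text is roughly 15% larger.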
So the best way to drive -Os forward is not to insist that it's good
(it might still be crap), and not to insist that it's crap (it might
have become better), but to dig out old problems and to look at what
kind of code current compilers generate in the kernel with -Os.
There have been a few pathological GCC optimizations in the past, but also
other problems, such as this one 9 years ago that hid useful warnings:
=================>
877417e6ffb9 Kbuild: change CC_OPTIMIZE_FOR_SIZE definition
=================>
From: Arnd Bergmann <arnd@xxxxxxxx>
Date: Mon, 25 Apr 2016 17:35:27 +0200
Subject: [PATCH] Kbuild: change CC_OPTIMIZE_FOR_SIZE definition
CC_OPTIMIZE_FOR_SIZE disables the often useful -Wmaybe-unused warning,
because that causes a ridiculous amount of false positives when combined
with -Os.
This means a lot of warnings don't show up in testing by the developers
that should see them with an 'allmodconfig' kernel that has
CC_OPTIMIZE_FOR_SIZE enabled, but only later in randconfig builds
that don't.
And this one by Linus, 14 years ago:
=================>
281dc5c5ec0f ("Give up on pushing CC_OPTIMIZE_FOR_SIZE")
=================>
From: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Date: Sun, 22 May 2011 14:30:36 -0700
Subject: [PATCH] Give up on pushing CC_OPTIMIZE_FOR_SIZE
I still happen to believe that I$ miss costs are a major thing, but
sadly, -Os doesn't seem to be the solution. With or without it, gcc
will miss some obvious code size improvements, and with it enabled gcc
will sometimes make choices that aren't good even with high I$ miss
ratios.
For example, with -Os, gcc on x86 will turn a 20-byte constant memcpy
into a "rep movsl". While I sincerely hope that x86 CPU's will some day
do a good job at that, they certainly don't do it yet, and the cost is
higher than a L1 I$ miss would be.
Some day I hope we can re-enable this.
Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
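The memcpy case from that commit message is easy to re-check with a current compiler. A minimal sketch, assuming a hypothetical 20-byte structure (struct blob20 and copy_blob20() are names made up for this illustration, not kernel code):

```c
#include <string.h>

/*
 * Hypothetical 20-byte structure, chosen only to match the copy size
 * mentioned in the commit message; not taken from the kernel.
 */
struct blob20 {
	unsigned int w[5];
};

/*
 * Constant-size copy: the length is known at compile time, so the
 * compiler is free to pick whatever inline expansion it prefers for
 * the selected optimization level.
 */
void copy_blob20(struct blob20 *dst, const struct blob20 *src)
{
	memcpy(dst, src, sizeof(*dst));
}
```

Building this once with "gcc -O2 -S" and once with "gcc -Os -S" and comparing the assembly generated for copy_blob20() shows whether the compiler at hand still chooses a string instruction for the copy at -Os. What comes out depends on the GCC version and tuning flags, so no particular result is assumed here.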
I'm quite sure there were more, but these were the ones that popped up
in a quick bit of Git archeology.
And yeah, it was me who pushed for -Os originally 17 years ago, due to
the positive I$ impact, in theory:
=================>
96fffeb4b413 ("make CC_OPTIMIZE_FOR_SIZE non-experimental")
=================>
From: Ingo Molnar <mingo@xxxxxxx>
Date: Mon, 28 Apr 2008 01:39:43 +0200
Subject: [PATCH] make CC_OPTIMIZE_FOR_SIZE non-experimental
this option has been the default on a wide range of distributions
for a long time - time to make it non-experimental.
Signed-off-by: Ingo Molnar <mingo@xxxxxxx>
Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
But practice disagreed with theory, and obviously in the kernel
practice has supremacy.
But yes, I'd cautiously agree that reduced kernel size with a -Os build
is a stochastic proxy metric for better code and better performance -
but it comes with caveats and needs to be backed by other data or
robust first-principles arguments too.
Thanks,
Ingo