Re: [PATCH] enforce function inlining for hot functions

From: Paul E. McKenney
Date: Sat Apr 25 2015 - 09:51:50 EST

On Sat, Apr 25, 2015 at 03:26:48PM +0200, Hagen Paul Pfeifer wrote:
> On 25 April 2015 at 12:31, Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> > I am not arguing either way on the wisdom or lack thereof of gcc's
> > inlining decisions. But PROVE_RCU=n and CONFIG_DEBUG_LOCK_ALLOC=n should
> > make rcu_read_lock() and rcu_read_unlock() both be empty functions in
> > a CONFIG_PREEMPT=n kernel, which should hopefully trivialize gcc's inlining
> > decisions in that particular case.
> Hey Paul,
> yes, with DEBUG_LOCK_ALLOC disabled all rcu_read_lock and unlock
> functions are perfectly inlined.

Whew!!! ;-)

> So now we have the following
> situation: depending on the gcc version and the particular kernel
> configuration, some hot functions are not inlined - they are duplicated
> hundreds of times. Which is bad no matter how you look at the
> gcc/kernel configuration. I think this should *never* happen.
> With the patch we can make sure that hot functions are *always*
> inlined - no matter what gcc version and kernel configuration is used.
> Furthermore, as Markus already noted: when compiled with -O2 this does
> not happen. Duplicates are only generated for -Os[1]. Ok, but now the
> question: should this happen for -Os? I don't think so. I think we
> can do better and mark these few functions as always inline. For
> the remaining functions marked inline we should give gcc the
> flexibility and not artificially enforce inlining. The current
> situation is bad: OPTIMIZE_INLINING defaults to no, which de facto
> enforces inlining for *all* functions marked inline. GCC's inlining
> mechanism is de facto disabled, which is also bad. Last but not least:
> the patch does not change anything for current users, because we will
> still disable OPTIMIZE_INLINING (resulting in __always_inline for all
> functions marked inline). The patch affects users who enable
> OPTIMIZE_INLINING and trust the compiler.
> Hagen
> PS: thank you Markus for the comment.
> [1] which is nonsense: the functions are not inlined, yet they are
> copied hundreds of times for "size optimized builds". gcc should rather
> redeclare the functions as global, define each one once, and call that
> single definition every time. But implementing such a scheme is probably a
> monster in itself, and LTO is required to solve all issues with such a
> concept.

I am guessing that there is only one duplicate per compilation unit?
I would also guess that the LTO guys would have a ready solution. ;-)
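
For readers following along, here is a minimal sketch of the distinction
under discussion. All names are hypothetical stand-ins, not the kernel's
actual macros or functions, and the comments describe what gcc is
permitted to do rather than what any particular version does.

```c
#include <assert.h>

/* Assumed stand-in for the kernel's __always_inline (GCC/Clang attribute). */
#define my_always_inline inline __attribute__((__always_inline__))

/*
 * Plain "inline" is only a hint.  When gcc declines to inline a static
 * inline function from a header, each compilation unit that calls it
 * can end up with its own out-of-line copy.
 */
static inline int hint_only_add(int a, int b)
{
	return a + b;
}

/* With always_inline, the body is expanded into every caller instead. */
static my_always_inline int forced_add(int a, int b)
{
	return a + b;
}

/* Hypothetical caller that uses both variants. */
int sum_three(int a, int b, int c)
{
	return forced_add(hint_only_add(a, b), c);
}

/* Both variants compute the same result; only code placement differs. */
void sum_three_selftest(void)
{
	assert(sum_three(1, 2, 3) == 6);
}
```

Comparing "nm" output for objects built with -O2 and with -Os is one way
to check whether an out-of-line copy of hint_only_add was emitted.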

That said, if a function was invoked extremely many times, it might
make sense to duplicate it even within a single compilation unit if
doing so allowed saving more than the size of the function in the
form of call instructions with shorter address fields. But I have
no idea whether or not gcc would do this sort of thing.
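
The break-even condition in the paragraph above can be written down as a
small sketch. The function name and all byte counts are illustrative
assumptions, not measurements of any real architecture or a description
of gcc's behavior.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: would adding one more copy of a function shrink
 * the image?  The extra copy costs func_bytes; each of the nr_calls
 * call sites that can now use a short-range call saves
 * (far_call_bytes - near_call_bytes).
 */
bool duplication_saves_space(size_t func_bytes, size_t nr_calls,
			     size_t far_call_bytes, size_t near_call_bytes)
{
	/* A near call is assumed never to be larger than a far call. */
	assert(near_call_bytes <= far_call_bytes);

	return nr_calls * (far_call_bytes - near_call_bytes) > func_bytes;
}
```

For example, with an assumed 16-byte function and 2 bytes saved per call
site, duplication only pays off beyond 8 call sites.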

Thanx, Paul
