Re: [PATCH] force inlining of spinlock ops

From: Andrew Morton
Date: Mon May 11 2015 - 18:19:23 EST

On Mon, 11 May 2015 19:57:22 +0200 Denys Vlasenko <dvlasenk@xxxxxxxxxx> wrote:

> With both gcc 4.7.2 and 4.9.2, sometimes gcc mysteriously doesn't inline
> very small functions we expect to be inlined. In particular,
> with this config:
> there are more than a thousand copies of tiny spinlock-related functions:
> $ nm --size-sort vmlinux | grep -iF ' t ' | uniq -c | grep -v '^ *1 ' | sort -rn | grep ' spin'
> 473 000000000000000b t spin_unlock_irqrestore
> 292 000000000000000b t spin_unlock
> 215 000000000000000b t spin_lock
> 134 000000000000000b t spin_unlock_irq
> 130 000000000000000b t spin_unlock_bh
> 120 000000000000000b t spin_lock_irq
> 106 000000000000000b t spin_lock_bh
> Disassembly:
> ffffffff81004720 <spin_lock>:
> ffffffff81004720: 55 push %rbp
> ffffffff81004721: 48 89 e5 mov %rsp,%rbp
> ffffffff81004724: e8 f8 4e e2 02 callq <_raw_spin_lock>
> ffffffff81004729: 5d pop %rbp
> ffffffff8100472a: c3 retq
> This patch fixes this via s/inline/__always_inline/ in spinlock.h.
> This decreases vmlinux by about 30k:
> text data bss dec hex filename
> 82375570 22255544 20627456 125258570 7774b4a vmlinux.before
> 82335059 22255416 20627456 125217931 776ac8b vmlinux

See also ("enforce function
inlining for hot functions").

Presumably Hagen didn't see the issue with spinlock functions. I
wonder why not.

I suppose we should get both these consolidated into a coherent whole.

It's a bit irritating to have to do this: presumably gcc will get fixed
and the huge sprinkling of __always_inline will become less and less
relevant over time and people will have trouble distinguishing "real
__always_inline which was put here for a purpose" from "dopey
__always_inline to work around a short-term gcc glitch".

__always_inline is one of those things where a usage site should always
be commented, because it's near impossible to work out why someone
chose to use it. Quick, tell me what's happening in include/linux/slab.h.

Perhaps we should do

* Comment goes here. It is very specific about gcc versions.
#define inline_for_broken_gcc __always_inline

and then use inline_for_broken_gcc everywhere. That way, the reason
for the marker is self-explanatory and we can later hunt all these
things down and remvoe them.

Also, the inline_for_broken_gcc definition can be made dependent on
particular gcc versions, which will allow us to easily keep an eye on
the behaviour of later gcc versions.

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at
Please read the FAQ at