Re: [PATCH 0/6] Macrofying inline assembly for better compilation

From: Nadav Amit
Date: Fri May 18 2018 - 09:19:36 EST


David Laight <David.Laight@xxxxxxxxxx> wrote:

> From: Nadav Amit
>> Sent: 17 May 2018 17:14
>> This patch-set deals with an interesting yet stupid problem: kernel code
>> that does not get inlined despite its simplicity. There are several
>> causes for this behavior: "cold" attribute on __init, different function
>> optimization levels; conditional constant computations based on
>> __builtin_constant_p(); and finally large inline assembly blocks.
>>
>> This patch-set deals with the inline assembly problem. I separated these
>> patches from the others (that were sent in the RFC) for easier
>> inclusion.
>>
>> The problem with inline assembly is that inline assembly is often used
>> by the kernel for things that are other than code - for example,
>> assembly directives and data. GCC however is oblivious to the content of
>> the blocks and assumes their cost in space and time is proportional to
>> the number of the perceived assembly "instruction", according to the
>> number of newlines and semicolons. Alternatives, paravirt and other
>> mechanisms are affected, causing code not to be inlined, and degrading
>> compilation quality in general.
>>
>> The solution that this patch-set carries for this problem is to create
>> an assembly macro, and then call it from the inline assembly block. As
>> a result, the compiler sees a single "instruction" and assigns the more
>> appropriate cost to the code. In addition, this patch-set removes
>> unneeded new-lines from common x86 inline asm's, which "confuse" GCC
>> heuristics.
>
> Can't you get the same effect by using always_inline ?

I meant to explain in the cover letter, but forgot, why always_inline is not
a proper solution:

1. It is not easy to go over 400 functions and mark them as __always_inline.
Maintaining it afterwards (i.e., removing the __always_inline if the
function is changed and becomes "heavier") is even harder.

2. The kernel can be configured in many ways, which can make functions
"cheaper" or more "expensive", so you cannot always predetermine whether
a function should be inlined.

3. If you mark a function __always_inline, you may simply cause the calling
function not to be inlined (when it should be inlined as well). It becomes
a game of whack-a-mole.

4. It is not only about inlining. The compiler also makes branch decisions
based on the perceived cost of the code, including inlined functions.

Regards,
Nadav