Re: PROPOSAL: Extend inline asm syntax with size spec

From: Richard Biener
Date: Thu Oct 11 2018 - 03:05:07 EST


On Wed, 10 Oct 2018, Nadav Amit wrote:

> at 7:53 AM, Segher Boessenkool <segher@xxxxxxxxxxxxxxxxxxx> wrote:
>
> > On Mon, Oct 08, 2018 at 11:07:46AM +0200, Richard Biener wrote:
> >> On Mon, 8 Oct 2018, Segher Boessenkool wrote:
> >>> On Sun, Oct 07, 2018 at 03:53:26PM +0000, Michael Matz wrote:
> >>>> On Sun, 7 Oct 2018, Segher Boessenkool wrote:
> >>>>> On Sun, Oct 07, 2018 at 11:18:06AM +0200, Borislav Petkov wrote:
> >>>>>> Now, Richard suggested doing something like:
> >>>>>>
> >>>>>> 1) inline asm ("...")
> >>>>>
> >>>>> What would the semantics of this be?
> >>>>
> >>>> The size of the inline asm wouldn't be counted towards the inliner size
> >>>> limits (or be counted as "1").
> >>>
> >>> That sounds like a good option.
> >>
> >> Yes, I also like it for simplicity. It also avoids the requirement
> >> of translating the number (in bytes?) given by the user to
> >> "number of GIMPLE instructions" as needed by the inliner.
> >
> > This patch implements this, for C only so far. And the syntax is
> > "asm inline", which is more in line with other syntax.
> >
> > How does this look?
>
> It looks good to me in general. I have a couple of reservations, but I
> suspect you will not want to address them:
>
> 1. It is not backward compatible, requiring a C macro to wrap it, as the
> kernel might be built with different compilers.
>
> 2. It is specific to asm. I do not have in mind another use case (excluding
> the __builtin_constant_p), but it would be nicer IMHO to have a builtin
> saying "ignore the cost of this statement" for the matter of optimizations.
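
On point 1, the wrapper macro could be sketched roughly as below. This is
only an illustration: HAVE_ASM_INLINE and asm_inline are hypothetical names,
the build system would have to define the former after probing the compiler,
and the "asm inline" spelling is the one from the patch under discussion.

```c
/*
 * Sketch only: HAVE_ASM_INLINE is a hypothetical macro that a build
 * system would define after checking that the compiler accepts the
 * "asm inline" syntax from the patch.
 */
#ifdef HAVE_ASM_INLINE
#define asm_inline __asm__ __inline__
#else
#define asm_inline __asm__	/* older compilers: plain asm, size still counted */
#endif

/*
 * An empty asm that only ties the value to a register. It contains no
 * target-specific instructions, so it should build with any
 * GCC-compatible compiler.
 */
static inline int pass_through(int x)
{
	asm_inline("" : "+r"(x));
	return x;
}
```

With a compiler that lacks the new syntax, the code still builds; the asm is
simply sized the old way by the inliner.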

The only easy possibility that comes to my mind is something like

__attribute__((always_inline, zero_cost)) foo ()
{
... your stmts ...
}

and have us, upon inlining, mark the inlined stmts as zero-cost. That would
also work for the asm() case and it would be backwards compatible
(well, you'd get a diagnostic for the unknown zero_cost attribute).
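
A slightly fuller sketch of how such a helper might be written, assuming the
attribute existed. zero_cost is only the name proposed here and is not
implemented; ZERO_COST, scale_by_three and caller are hypothetical names for
illustration. Guarding the attribute with __has_attribute is one way to avoid
the diagnostic on compilers that do not know it.

```c
/*
 * Sketch only: zero_cost is the attribute proposed in this mail and is
 * not implemented. ZERO_COST is a hypothetical wrapper that expands to
 * nothing unless the compiler reports support for the attribute.
 */
#if defined(__has_attribute)
# if __has_attribute(zero_cost)
#  define ZERO_COST __attribute__((zero_cost))
# endif
#endif
#ifndef ZERO_COST
# define ZERO_COST
#endif

static inline __attribute__((always_inline)) ZERO_COST
int scale_by_three(int x)
{
	/* stands in for "... your stmts ..." */
	return x * 3;
}

int caller(int y)
{
	/* after inlining, the multiply would carry the zero-cost mark */
	return scale_by_three(y) + 1;
}
```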

There's the slight complication that if you have, say

_1 = _2 * 3; // zero-cost
_4 = _1 * 2;

and optimization ends up combining those to

_4 = _2 * 6;

then is this stmt supposed to be zero-cost or not? Compare that to

_1 = _2 * 3;
_4 = _1 * 2; // zero-cost

So outside of asm() there are new issues that come up with respect
to expected (cost) semantics.
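
To make the ambiguity concrete, here is a toy model of the two obvious merge
rules. It is not how GCC represents statements or costs; struct mul_stmt,
enum merge_policy and combine() are made up for illustration.

```c
#include <stdbool.h>

/* Toy model, not GCC internals: a statement "lhs = rhs * factor". */
struct mul_stmt {
	int factor;
	bool zero_cost;
};

enum merge_policy {
	ZERO_IF_ANY,	/* result is free if either input was marked */
	ZERO_IF_ALL	/* result is free only if both inputs were marked */
};

/* Combine "_1 = _2 * a; _4 = _1 * b" into "_4 = _2 * (a * b)". */
static struct mul_stmt combine(struct mul_stmt def, struct mul_stmt use,
			       enum merge_policy policy)
{
	struct mul_stmt out;

	out.factor = def.factor * use.factor;
	out.zero_cost = (policy == ZERO_IF_ANY)
		? (def.zero_cost || use.zero_cost)
		: (def.zero_cost && use.zero_cost);
	return out;
}
```

Both examples above have exactly one marked statement, so ZERO_IF_ANY makes
the combined stmt free in both cases and ZERO_IF_ALL makes it costed in both
cases. Neither rule distinguishes which of the two stmts carried the mark.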

Richard.