Re: [PATCH] compiler, clang: Add always_inline attribute to inline
From: Mark Rutland
Date: Tue Jun 20 2017 - 06:52:56 EST

On Mon, Jun 19, 2017 at 02:42:23PM -0700, David Rientjes wrote:
> On Mon, 19 Jun 2017, Sodagudi Prasad wrote:
>
> > > > Commit abb2ea7dfd82 ("compiler, clang: suppress warning for unused
> > > > static inline functions") re-defining the 'inline' macro but
> > > > __attribute__((always_inline)) is missing. Some compilers may
> > > > not honor inline hint if always_inline attribute not there.
> > > > So add always_inline attribute to inline as done by
> > > > compiler-gcc.h file.
> > > >
> > >
> > > IIUC, __attribute__((always_inline)) was only needed for gcc versions < 4
> > > and that the inlining decision making is improved in >= 4. To make a
> > > change like this, I would think that we would need to show that clang is
> > > making suboptimal decisions. I don't think there's a downside to making
> > > CONFIG_OPTIMIZE_INLINING specific only to gcc.
> > >
> > > If it is shown that __attribute__((always_inline)) is needed for clang as
> > > well, this should be done as part of compiler-gcc.h to avoid duplicated
> > > code.
> >
> > Hi David,
> >
> > Here is the discussion about this change -
> > https://lkml.org/lkml/2017/6/15/396
> > Please check Mark and Will's comments.
> >
>
> Yes, the arch/arm64/include/asm/cmpxchg.h instance appears to need
> __always_inline as several other functions need __always_inline in
> arch/arm64/include/*. It's worth making that change as you suggested in
> your original patch.
>
> The concern, however, is inlining all "inline" functions forcefully. The
> only reason this is done for gcc is because of suboptimal inlining
> decisions in gcc < 4.
>
> So the question is whether this is a single instance that can be fixed
> where clang un-inlining causes problems or whether that instance suggests
> all possible inline usage for clang absolutely requires __always_inline
> due to a suboptimal compiler implementation. I would suggest the former.

My concern here is that code has been written with the implicit
assumption that inline means __always_inline, since that's been the case
for years with GCC, when !ARCH_SUPPORTS_OPTIMIZED_INLINING ||
!CONFIG_OPTIMIZE_INLINING (i.e. for every !x86 arch).

While this is the only breakage seen so far, it seems likely that
similar breakage may exist elsewhere, and such breakage may easily be
introduced by those only using GCC.

I'd prefer to use the same guards for clang here, since that ensures
that such code works by default across both compilers. That gives us the
chance to test and fix up code without a violent flag day.

Once we've fixed up the core arm64 code, we can select
ARCH_SUPPORTS_OPTIMIZED_INLINING, and allow users to optionally select
CONFIG_OPTIMIZE_INLINING (with either compiler).

Once that's seen some testing, and if there's a benefit, then we can try
to align with x86 and default to selecting CONFIG_OPTIMIZE_INLINING,
and/or drop the config options entirely and only check the GCC version.

Thanks,
Mark.
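
For reference, a rough and untested sketch of what sharing the guards
could look like. The condition mirrors the one described above for GCC
(Kconfig symbols show up with a CONFIG_ prefix in C). INLINE_IS_FORCED
and add_one() are hypothetical names used only for illustration, and the
in-tree definitions carry additional attributes that are omitted here:

```c
/*
 * Sketch only: the CONFIG_* macros are normally provided by Kconfig.
 * Leaving both undefined models an arch that has not selected
 * ARCH_SUPPORTS_OPTIMIZED_INLINING.
 */
#if !defined(CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING) || \
    !defined(CONFIG_OPTIMIZE_INLINING)
/* Force inlining, so that "inline" behaves as __always_inline. */
#define inline inline __attribute__((always_inline))
#define INLINE_IS_FORCED 1	/* hypothetical marker, illustration only */
#else
/* Leave the inlining decision to the compiler. */
#define INLINE_IS_FORCED 0	/* hypothetical marker, illustration only */
#endif

/* Hypothetical helper standing in for code that relies on being inlined. */
static inline int add_one(int x)
{
	return x + 1;
}
```

With neither symbol defined, every function marked inline is forced
inline regardless of compiler; defining both leaves the decision to the
compiler.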