Re: [RFC][PATCH 1/5] static_call: Make NULL static calls consistent
From: Peter Zijlstra
Date: Sun Mar 12 2023 - 11:17:55 EST
On Fri, Mar 10, 2023 at 05:20:04PM -0800, Josh Poimboeuf wrote:
> On Fri, Mar 10, 2023 at 09:59:26PM +0100, Peter Zijlstra wrote:
> > > -#define __static_call_cond(name) \
> > > -({ \
> > > - void *func = READ_ONCE(STATIC_CALL_KEY(name).func); \
> > > - if (!func) \
> > > - func = &__static_call_nop; \
> > > - (typeof(STATIC_CALL_TRAMP(name))*)func; \
> > > -})
> >
> > So a sufficiently clever compiler can optimize the above to avoid the
> > actual indirect call (and resulting CFI violation, see below), because
> > __static_call_nop() is inline and hence visible as an empty stub
> > function. Currently none of the compilers are that clever :/
>
> I won't hold my breath waiting for theoretical optimizations.
Well, I'm thinking the clang folks might like this option to unbreak the
arm64 build. At least here they have a fighting chance of actually doing
the right thing.
Let me Cc some actual compiler folks.
> > This will break ARM64 I think, they don't HAVE_STATIC_CALL but do have
> > CLANG_CFI, which means the above will end up being a runtime indirect
> > call to a non-matching signature function.
> >
> > Now, I suppose we don't actually have this happen in current code by the
> > simple expedient of not actually having any static_call_cond() usage
> > outside of arch code.
> >
> > (/me git-grep's some and *arrrggh* trusted-keys)
> >
> > I really don't think we can do this though, must not promote CFI
> > violations.
>
> Ouch, so static_call_cond() and __static_call_return0() are broken today
> on CFI_CLANG + arm64.
Yes. Now __static_call_return0() should really only be used with
HAVE_STATIC_CALL, since its definition is only available in that case.
And static_call_cond() as implemented today *might* just be fixable by
the compiler.
> Some ideas:
>
> 1) Implement HAVE_STATIC_CALL for arm64. IIRC, this wasn't worth the
> effort due to restricted branch ranges and CFI fun.
The powerpc32 thing did it; iirc a similar approach could work for arm.
But this would basically mandate HAVE_STATIC_CALL for CFI_CLANG.
>
> 2) Create yet another "tier" of static call implementations, for
> arches which can have the unfortunate combo of CFI_CLANG +
> !HAVE_STATIC_CALL. CONFIG_ALMOST_DONT_HAVE_STATIC_CALL?
>
> The arch can define ARCH_DEFINE_STATIC_CALL_NOP() which uses inline
> asm to create a CFI-compliant NOP/BUG/whatever version of the
> function (insert lots of hand-waving). Is the kcfi hash available
> to inline asm at build time?
Yes, clang creates a magic symbol for everything it sees a declaration
for. These symbols can be referenced from asm; linking will make it all
work.
And yes, C sucks, you can't actually create a function definition from a
type :/ Otherwise this could be trivially fixable.
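To illustrate the limitation: a function type can declare a function but
cannot define one, so a NOP whose signature matches has to have that
signature spelled out by hand. A minimal user-space sketch, where hook_fn,
hook_nop, hook_key, call_hook_cond and hook_inc are all hypothetical names
and not kernel API:

```c
#include <stddef.h>

/* Hypothetical hook type, standing in for a static call's signature. */
typedef void hook_fn(int *counter);

/* A function type can *declare* a function ... */
hook_fn hook_nop;

/*
 * ... but it cannot define one: "hook_fn hook_nop { }" is not valid C, so
 * the signature has to be repeated by hand for the definition.
 */
void hook_nop(int *counter)
{
	(void)counter;	/* intentionally empty */
}

/* Hypothetical key: NULL means "no target installed". */
hook_fn *hook_key = NULL;

/* Substitute the type-matched NOP so the indirect call keeps its signature. */
void call_hook_cond(int *counter)
{
	hook_fn *func = hook_key;

	if (!func)
		func = hook_nop;
	func(counter);
}

/* Example target with the same signature. */
void hook_inc(int *counter)
{
	(*counter)++;
}
```

With hook_key left NULL, call_hook_cond() ends up in hook_nop(); once
hook_key points at hook_inc, the same call site increments the counter. The
cost is one hand-written NOP per signature, which is exactly what can't be
generated from the type alone.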
> 3) Use a jump label to bypass the static call instead of calling
> __static_call_nop(). NOTE: I couldn't figure out how to do this
> without angering the compiler, unless we want to change
> static_call() back to the old-school interface:
>
> static_call(foo, args...)
>
> Is it Friday yet?
Always right :-)
And yes, the whole premise of all this is that we let the compiler
generate the actual CALL and then have objtool scan the output and
report the locations of them. There is no way to intercept this at the
compiler level.
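
For contrast, here is roughly what the old-school interface from idea 3
buys: when the arguments pass through the macro, the macro owns the whole
call expression and can skip it on NULL, whereas the static_call(name)(args...)
form only hands back the callee. This is a user-space sketch under that
assumption; static_call_cond_sketch, notify_fn, notify_key, notify_add and
run_demo are hypothetical names, and it only suits void-returning targets:

```c
#include <stddef.h>

/* Hypothetical target signature for the sketch. */
typedef void notify_fn(int *hits, int delta);

/* Hypothetical key: a pointer slot that may be NULL. */
notify_fn *notify_key = NULL;

/*
 * Hypothetical macro, not the kernel's static_call(): the arguments are
 * macro parameters, so a NULL key skips the call and no NOP stub is needed.
 * Uses the GNU __typeof__ extension (gcc and clang).
 */
#define static_call_cond_sketch(key, ...)		\
	do {						\
		__typeof__(key) fn_ = (key);		\
		if (fn_)				\
			fn_(__VA_ARGS__);		\
	} while (0)

/* Example target. */
void notify_add(int *hits, int delta)
{
	*hits += delta;
}

/* Calls once with a NULL key (skipped) and once with a real target. */
int run_demo(void)
{
	int hits = 0;

	static_call_cond_sketch(notify_key, &hits, 5);	/* NULL: no call */
	notify_key = notify_add;
	static_call_cond_sketch(notify_key, &hits, 5);	/* calls notify_add */
	return hits;
}
```

Since the NULL case never makes an indirect call, there is no signature to
mismatch. The price is the interface change itself.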