Re: [RFC][PATCH 1/5] static_call: Make NULL static calls consistent
From: Sean Christopherson
Date: Mon Mar 13 2023 - 11:09:02 EST
On Sun, Mar 12, 2023, Peter Zijlstra wrote:
> On Fri, Mar 10, 2023 at 05:20:04PM -0800, Josh Poimboeuf wrote:
> > On Fri, Mar 10, 2023 at 09:59:26PM +0100, Peter Zijlstra wrote:
> > > > -#define __static_call_cond(name) \
> > > > -({ \
> > > > - void *func = READ_ONCE(STATIC_CALL_KEY(name).func); \
> > > > - if (!func) \
> > > > - func = &__static_call_nop; \
> > > > - (typeof(STATIC_CALL_TRAMP(name))*)func; \
> > > > -})
> > >
> > > So a sufficiently clever compiler can optimize the above to avoid the
> > > actual indirect call (and resulting CFI violation, see below), because
> > > __static_call_nop() is inline and hence visible as an empty stub
> > > function. Currently none of the compilers are that clever :/
> >
> > I won't hold my breath waiting for theoretical optimizations.
>
> Well, I'm thinking the clang folks might like this option to unbreak the
> arm64 build. At least here they have a fighting chance of actually doing
> the right thing.
>
> Let me Cc some actual compiler folks.
+Will and Kees too for the arm64+CFI mess.
https://lore.kernel.org/all/YfrQzoIWyv9lNljh@xxxxxxxxxx
> > > This will break ARM64 I think, they don't HAVE_STATIC_CALL but do have
> > > CLANG_CFI, which means the above will end up being a runtime indirect
> > > call to a non-matching signature function.
> > >
> > > Now, I suppose we don't actually have this happen in current code by the
> > > simple expedient of not actually having any static_call_cond() usage
> > > outside of arch code.
> > >
> > > (/me git-grep's some and *arrrggh* trusted-keys)
> > >
> > > I really don't think we can do this though, must not promote CFI
> > > violations.
> >
> > Ouch, so static_call_cond() and __static_call_return0() are broken today
> > on CFI_CLANG + arm64.
>
> Yes. Now __static_call_return0() should really only happen when
> HAVE_STATIC_CALL per the definition only being available in that case.
>
> And static_call_cond() as implemented today *might* just be fixable by
> the compiler.
>
> > Some ideas:
> >
> > 1) Implement HAVE_STATIC_CALL for arm64. IIRC, this wasn't worth the
> > effort due to restricted branch ranges and CFI fun.
>
> The powerpc32 thing did it, iirc a similar approach could work for arm.
> But this would basically mandate HAVE_STATIC_CALL for CFI_CLANG.
>
> >
> > 2) Create yet another "tier" of static call implementations, for
> > arches which can have the unfortunate combo of CFI_CLANG +
> > !HAVE_STATIC_CALL. CONFIG_ALMOST_DONT_HAVE_STATIC_CALL?
> >
> > The arch can define ARCH_DEFINE_STATIC_CALL_NOP() which uses inline
> > asm to create a CFI-compliant NOP/BUG/whatever version of the
> > function (insert lots of hand-waving). Is the kcfi hash available
> > to inline asm at build time?
>
> Yes, clang creates magic symbol for everything it sees a declaration
> for. This symbols can be referenced from asm, linking will make it all
> work.
>
> And yes, C sucks, you can't actually create a function definition from a
> type :/ Otherwise this could be trivially fixable.
>
> > 3) Use a jump label to bypass the static call instead of calling
> > __static_call_nop(). NOTE: I couldn't figure out how to do this
> > without angering the compiler, unless we want to change
> > static_call() back to the old-school interface:
> >
> > static_call(foo, args...)
> >
> > Is it Friday yet?
>
> Always right :-)
>
> And yes, the whole premise of all this is that we let the compiler
> generate the actuall CALL and then have objtool scan the output and
> report the locations of them. There is no way to intercept this at the
> compiler level.