Re: [PATCH v4 14/18] static_call: Add static_cond_call()
From: Rasmus Villemoes
Date: Mon May 04 2020 - 03:20:10 EST
On 03/05/2020 14.58, Peter Zijlstra wrote:
> On Sat, May 02, 2020 at 03:08:00PM +0200, Rasmus Villemoes wrote:
>> On 01/05/2020 22.29, Peter Zijlstra wrote:
>>> +#define static_cond_call(name) \
>>> +	if (STATIC_CALL_KEY(name).func) \
>>> +		((typeof(STATIC_CALL_TRAMP(name))*)(STATIC_CALL_KEY(name).func))
>>> +
>>
>> This addresses neither the READ_ONCE issue nor the fact that,
>> AFAICT, the semantics of
>>
>> static_cond_call(foo)(i++)
>>
>> will depend on CONFIG_HAVE_STATIC_CALL.
>
> True.
>
> So there is something utterly terrible we can do to address both:
>
> void __static_call_nop(void)
> {
> }
>
> #define __static_cond_call(name) \
> ({ \
> 	void *func = READ_ONCE(STATIC_CALL_KEY(name).func); \
> 	if (!func) \
> 		func = &__static_call_nop; \
> 	(typeof(STATIC_CALL_TRAMP(name))*)func; \
> })
>
> #define static_cond_call(name) (void)__static_cond_call(name)
>
> This gets us into Undefined Behaviour territory, but it ought to work.
>
> It adds the READ_ONCE(), and it cures the argument evaluation issue.

Indeed, that is horrible. And it "fixes" the argument evaluation by
changing the !HAVE_STATIC_CALL case to match the HAVE_STATIC_CALL case,
not the other way around, which means that it is not a direct equivalent
of the

	if (foo)
		foo(a, b, c)

[which pattern of course has the READ_ONCE issue, but each individual
existing site with that may be ok for various reasons].

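To make the difference concrete, here is a minimal userspace sketch. All names in it (hook_fn, guarded_call, nop_substituted_call, nop_hook) are hypothetical stand-ins, not anything from the patch, and the nop deliberately has a matching signature, so unlike the quoted macro it stays clear of the mismatched-prototype UB:

```c
#include <stddef.h>

typedef void (*hook_fn)(int);

static int last_seen = -1;

/* A real hook that records its argument. */
static void record(int v) { last_seen = v; }

/* Hypothetical nop with the same signature as the hook. */
static void nop_hook(int v) { (void)v; }

/* Classic pattern: the argument expression is guarded by the test. */
static int guarded_call(hook_fn hook)
{
	int i = 0;

	if (hook)
		hook(i++);
	return i;	/* i++ was skipped if hook was NULL */
}

/* NULL -> nop substitution: the argument is always evaluated. */
static int nop_substituted_call(hook_fn hook)
{
	int i = 0;
	hook_fn f = hook ? hook : nop_hook;

	f(i++);
	return i;	/* i++ happened regardless of hook */
}
```

With a NULL hook, guarded_call() leaves i untouched while nop_substituted_call() increments it; with a non-NULL hook the two behave the same.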
Is gcc smart enough to change the if (!func) to a jump across the
function call (but still evaluating side effects in args), or is
__static_call_nop actually emitted and called? If the latter, then one
might as well patch the write-side to do "WRITE_ONCE(foo, func ? :
__static_call_nop)" and elide the test from __static_cond_call() - in
fact, that just becomes a single READ_ONCE. [There's probably some
annoying issue with making sure static initialization of foo points at
__static_call_nop].

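Roughly what I have in mind for the write-side variant, again as a userspace sketch only. The names (hook_slot, hook_update, hook_invoke) are made up for illustration, and the volatile casts are just a crude approximation of READ_ONCE/WRITE_ONCE, not the kernel macros:

```c
#include <stddef.h>

typedef void (*hook_fn)(int);

static int calls;

/* A real hook that accumulates its argument. */
static void count_hook(int v) { calls += v; }

/* Hypothetical nop with the same signature as the hook. */
static void nop_hook(int v) { (void)v; }

/* Static initialization points at the nop, so the slot is never NULL. */
static hook_fn hook_slot = nop_hook;

/* Write side: substitute the nop for NULL once, at update time. */
static void hook_update(hook_fn func)
{
	*(volatile hook_fn *)&hook_slot = func ? func : nop_hook;
}

/* Read side: a single load and an unconditional indirect call. */
static void hook_invoke(int v)
{
	hook_fn f = *(volatile hook_fn *)&hook_slot;

	f(v);
}
```

In this toy version the static initializer is trivial; whether that carries over to the actual static_call key definitions is exactly the part I'm unsure about.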
And that brings me to the other issue I raised - do you have a few
examples of call sites that could use this, so we can see disassembly
before/after? I'm still concerned that, even if there are no
side-effects in the arguments, you still force the compiler to
spill/shuffle registers for call/restore unconditionally, whereas with a
good ol' if(), all that work is guarded by the load+test.

Rasmus