Re: [PATCH v7] arm64: implement support for static call trampolines

From: Peter Zijlstra

Date: Tue Mar 17 2026 - 07:38:24 EST


On Tue, Mar 17, 2026 at 12:31:51PM +0100, Ard Biesheuvel wrote:
>
> On Tue, 17 Mar 2026, at 12:24, Peter Zijlstra wrote:
> > On Tue, Mar 17, 2026 at 11:59:49AM +0100, Ard Biesheuvel wrote:
> >
> >> > +#define __ARCH_DEFINE_STATIC_CALL_TRAMP(name, target) \
> >> > + asm(" .pushsection .static_call.text, \"ax\" \n" \
> >> > + " .align 3 \n" \
> >> > + " .globl " STATIC_CALL_TRAMP_STR(name) " \n" \
> >> > + STATIC_CALL_TRAMP_STR(name) ": \n" \
> >> > + " hint 34 /* BTI C */ \n" \
> >> > + " adrp x16, 1f \n" \
> >> > + " ldr x16, [x16, :lo12:1f] \n" \
> >> > + " cbz x16, 0f \n" \
> >> > + " br x16 \n" \
> >> > + "0: ret \n" \
> >> > + " .type " STATIC_CALL_TRAMP_STR(name) ", %function \n" \
> >> > + " .size " STATIC_CALL_TRAMP_STR(name) ", . - " STATIC_CALL_TRAMP_STR(name) " \n" \
> >> > + " .popsection \n" \
> >> > + " .pushsection .rodata, \"a\" \n" \
> >> > + " .align 3 \n" \
> >> > + "1: .quad " target " \n" \
> >> > + " .popsection \n")
> >
> >> > +void arch_static_call_transform(void *site, void *tramp, void *func, bool tail)
> >> > +{
> >> > + u64 literal;
> >> > + int ret;
> >> > +
> >>
> >> Here, set func to &__static_call_return0 if it is NULL.
> >
> > I'm confused. NULL is for the static_call_cond() case, where we NO-OP.
> > And the trampoline above does that cbz 0f to ret. So far so good.
> >
> > But ret0 should return 0, and IIRC arm64 uses x0 for the return value.
> > But I don't see the above clearing it. Hmm?
> >

(your MUA seems busted and generates these silly long lines, let me
reflow again)

> The RET0 case will tail call __static_call_return0() which will take
> care of this.
>
> I am just saying that the NULL case could just do the same, rather
> than have a conditional branch in the trampoline, as even in that
> case, the surrounding code must assume that X0 is clobbered.
> Alternatively, we could tail call __static_call_nop(), which would do
> just the 'ret'.
>
> IOW, if we guarantee that the target is always set to something
> appropriate, we can elide the NULL check, and __static_call_return0()
> and __static_call_nop() are equally appropriate for the NULL case.

Ah, yes that is possible. Trade that cbz for an unconditional branch to
a function. That works fine.

Do whatever is best for the uarch etc.
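
For illustration only, here is a rough user-space C model of the two options discussed above. It is not the arm64 code: struct sc_slot stands in for the .quad literal the trampoline loads, and every sc_* name is made up for this sketch. The one thing it does not model faithfully is the return value in the NULL case, which in the real trampoline is simply whatever happens to be in x0.

```c
#include <assert.h>
#include <stddef.h>

typedef long (*sc_func_t)(void);

/* Hypothetical model of the literal in .rodata that the trampoline loads. */
struct sc_slot {
	sc_func_t target;
};

/* Hypothetical stand-in for __static_call_return0(). */
static long sc_return0(void)
{
	return 0;
}

/* Hypothetical example target, only here so the sketch can be exercised. */
static long sc_example_target(void)
{
	return 42;
}

/*
 * Option A (as in the quoted v7 trampoline): test the literal at call
 * time. The if models "cbz x16, 0f" followed by "0: ret".
 */
static long sc_call_checked(const struct sc_slot *slot)
{
	if (!slot->target)
		return 0; /* assumption: real code leaves x0 unspecified here */
	return slot->target();
}

/*
 * Option B: never store NULL. The update path substitutes a stub, so
 * the call path is an unconditional indirect branch.
 */
static void sc_update(struct sc_slot *slot, sc_func_t func)
{
	slot->target = func ? func : sc_return0;
}

static long sc_call_unchecked(const struct sc_slot *slot)
{
	/* Invariant maintained by sc_update(): the slot is never NULL. */
	assert(slot->target != NULL);
	return slot->target();
}
```

With option B the NULL handling moves from every call to the (rare) update, which is the trade described above: one conditional branch in the trampoline for an unconditional branch to a stub.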