Re: [PATCH] static_call,x86: Robustify trampoline patching

From: Peter Zijlstra
Date: Sun Oct 31 2021 - 12:39:50 EST


On Sun, Oct 31, 2021 at 05:24:13PM +0100, Ard Biesheuvel wrote:
> On Sat, 30 Oct 2021 at 20:55, Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:
> >
> > On Sat, 30 Oct 2021 at 20:03, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > >
> > > On Sat, Oct 30, 2021 at 07:19:53PM +0200, Ard Biesheuvel wrote:
> > > > I just realized that arm64 has the exact same problem, which is not
> > > > being addressed by my v5 of the static call support patch.
> > >
> > > Yeah, it would.
> > >
> > > > As it turns out, the v11 Clang that I have been testing with is broken
> > > > wrt BTI landing pads, and omits them from the jump table entries.
> > > > Clang 12+ adds them properly, which means that both the jump table
> > > > entry and the static call trampoline may start with BTI C + direct
> > > > branch, and we also need additional checks to disambiguate.
> > >
> > > I'm not sure, why would the static_call trampoline need a BTI C ? The
> > > whole point of static_call() is to be a direct call, we should never
> > > have an indirect call to the trampoline, that would defeat the whole
> > > purpose.
> >
> > This might happen when the distance between the caller and the
> > trampoline is more than 128 MB, in which case we emit a veneer that
> > uses an indirect call as well. So we definitely need the landing pad
> > in the trampoline.
>
> Something like the below seems to work to prevent getting the wrong
> trampoline address into arch_static_call_transform:

Is is also a terriblly gross hack. I really want the clang-cfi stuff to
improve, not add layers of hacks on top of it.