Re: [PATCH] static_call,x86: Robustify trampoline patching
From: Ard Biesheuvel
Date: Sun Oct 31 2021 - 12:44:19 EST
On Sun, 31 Oct 2021 at 17:39, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Sun, Oct 31, 2021 at 05:24:13PM +0100, Ard Biesheuvel wrote:
> > On Sat, 30 Oct 2021 at 20:55, Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:
> > >
> > > On Sat, 30 Oct 2021 at 20:03, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > > >
> > > > On Sat, Oct 30, 2021 at 07:19:53PM +0200, Ard Biesheuvel wrote:
> > > > > I just realized that arm64 has the exact same problem, which is not
> > > > > being addressed by my v5 of the static call support patch.
> > > >
> > > > Yeah, it would.
> > > >
> > > > > As it turns out, the v11 Clang that I have been testing with is broken
> > > > > wrt BTI landing pads, and omits them from the jump table entries.
> > > > > Clang 12+ adds them properly, which means that both the jump table
> > > > > entry and the static call trampoline may start with BTI C + direct
> > > > > branch, and we also need additional checks to disambiguate.
> > > >
> > > > I'm not sure, why would the static_call trampoline need a BTI C ? The
> > > > whole point of static_call() is to be a direct call, we should never
> > > > have an indirect call to the trampoline, that would defeat the whole
> > > > purpose.
> > >
> > > This might happen when the distance between the caller and the
> > > trampoline is more than 128 MB, in which case we emit a veneer that
> > > uses an indirect call as well. So we definitely need the landing pad
> > > in the trampoline.
> >
> > Something like the below seems to work to prevent getting the wrong
> > trampoline address into arch_static_call_transform:
>
> Is is also a terriblly gross hack. I really want the clang-cfi stuff to
> improve, not add layers of hacks on top of it.
I'm just as annoyed as you are about the apparent need for this.
However, emitting an alias at build time is far better IMHO than
adding a magic byte sequence and having to check it at runtime.