RE: [PATCH 2/4] arm64: implement support for static call trampolines

From: David Laight
Date: Sat Sep 25 2021 - 13:46:36 EST


From: Mark Rutland
> Sent: 21 September 2021 17:28
>
> On Tue, Sep 21, 2021 at 05:55:11PM +0200, Ard Biesheuvel wrote:
> > On Tue, 21 Sept 2021 at 17:33, Mark Rutland <mark.rutland@xxxxxxx> wrote:
> > >
> > > On Tue, Sep 21, 2021 at 04:44:56PM +0200, Ard Biesheuvel wrote:
> > > > On Tue, 21 Sept 2021 at 09:10, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > ...
...
> > >
> > > I think so, yes. We can do slightly better with an inline literal pool
> > > and a PC-relative LDR to fold the ADRP+LDR, e.g.
> > >
> > >         .align 3
> > > tramp:
> > >         BTI C
> > >         {B <func> | RET | NOP}
> > >         LDR X16, 1f
> > >         BR X16
> > > 1:      .quad <literal>
> > >
> > > Since that's in the .text, it's RO for regular accesses anyway.
> > >
> >
> > I tried to keep the literal in .rodata to avoid inadvertent gadgets
> > and/or anticipate exec-only mappings of .text, but that may be a bit
> > overzealous.
>
> I think that in practice the risk of gadgetisation is minimal, and
> having it inline means we only need to record a single address per
> trampoline, so there's less risk that we get the patching wrong.

But doesn't that mean the literal load is almost certainly a data cache miss?
You really want an instruction that reads the constant from the I-cache.
Failing that, you at least want to be able to 'bunch together' the constants
so they stand a chance of sharing a D-cache line.
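
To put rough numbers on the 'bunch together' point, here is a small C sketch
that counts how many D-cache lines a run of 8-byte literals occupies. It is
only an illustration under assumptions that are not from the thread: a
64-byte line (CACHE_LINE), trampolines laid out back to back from a
line-aligned base, and a made-up helper name (lines_touched). The 24-byte
stride and the offset of 16 come from the sequence quoted above (four 4-byte
instructions followed by the .quad).

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed D-cache line size in bytes; real arm64 parts vary. */
#define CACHE_LINE 64u

/*
 * Count the distinct cache lines holding 'count' 8-byte literals.
 * The first literal sits 'first' bytes from a line-aligned base and
 * each later one is 'stride' bytes further on. Both values are taken
 * to be multiples of 8, so no literal straddles a line boundary.
 * Offsets only ever increase, so counting line changes is enough.
 */
size_t lines_touched(size_t first, size_t stride, size_t count)
{
    size_t lines = 0;
    size_t prev = SIZE_MAX;   /* sentinel: no line seen yet */

    for (size_t i = 0; i < count; i++) {
        size_t line = (first + i * stride) / CACHE_LINE;

        if (line != prev) {
            lines++;
            prev = line;
        }
    }
    return lines;
}
```

With those assumptions, lines_touched(16, 24, 16) models 16 inline literals
and lines_touched(0, 8, 16) models the same 16 literals packed into one
table. The packed table needs fewer D-cache lines, and in the inline case
those lines are mostly filled with instructions that are also fetched
through the I-cache.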

David
