Re: [PATCH v6 2/2] arm64: implement support for static call trampolines
From: Ard Biesheuvel
Date: Mon Nov 08 2021 - 06:29:18 EST
On Mon, 8 Nov 2021 at 11:23, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Fri, Nov 05, 2021 at 03:59:17PM +0100, Ard Biesheuvel wrote:
> > diff --git a/arch/arm64/include/asm/static_call.h b/arch/arm64/include/asm/static_call.h
> > new file mode 100644
> > index 000000000000..6ee918991510
> > --- /dev/null
> > +++ b/arch/arm64/include/asm/static_call.h
> > @@ -0,0 +1,40 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +#ifndef _ASM_STATIC_CALL_H
> > +#define _ASM_STATIC_CALL_H
> > +
> > +/*
> > + * The sequence below is laid out in a way that guarantees that the literal and
> > + * the instruction are always covered by the same cacheline, and can be updated
> > + * using a single store-pair instruction (provided that we rewrite the BTI C
> > + * instruction as well). This means the literal and the instruction are always
> > + * in sync when observed via the D-side.
> > + *
> > + * However, this does not guarantee that the I-side will catch up immediately
> > + * as well: until the I-cache maintenance completes, CPUs may branch to the old
> > + * target, or execute a stale NOP or RET. We deal with this by writing the
> > + * literal unconditionally, even if it is 0x0 or the branch is in range. That
> > + * way, a stale NOP will fall through and call the new target via an indirect
> > + * call. Stale RETs or Bs will be taken as before, and branch to the old
> > + * target.
> > + */
>
> Thanks for the comment!
>
>
> > diff --git a/arch/arm64/kernel/patching.c b/arch/arm64/kernel/patching.c
> > index 771f543464e0..a265a87d4d9e 100644
> > --- a/arch/arm64/kernel/patching.c
> > +++ b/arch/arm64/kernel/patching.c
>
>
> > +static void *strip_cfi_jt(void *addr)
> > +{
> > + if (IS_ENABLED(CONFIG_CFI_CLANG)) {
> > + void *p = addr;
> > + u32 insn;
> > +
> > + /*
> > + * Taking the address of a function produces the address of the
> > + * jump table entry when Clang CFI is enabled. Such entries are
> > + * ordinary jump instructions, preceded by a BTI C instruction
> > + * if BTI is enabled for the kernel.
> > + */
> > + if (IS_ENABLED(CONFIG_ARM64_BTI_KERNEL))
> > + p += 4;
>
> Perhaps:
> if (aarch64_insn_is_bti(le32_to_cpup(p)))
That instruction does not exist yet, and it begs the question which
type of BTI instruction we want to detect.
> p += 4;
>
> Perhapser still, add:
> else
> WARN_ON(IS_ENABLED(CONFIG_ARM64_BTI_KERNEL));
>
There's already a WARN() below that will trigger and return the
original address if the entry did not have the expected layout, which
means a direct branch at offset 0x0 or 0x4 depending on whether BTI is
on.
So I could add a WARN() here as well, but I'd prefer to keep the one
at the bottom, which makes the one here slightly redundant.
> > +
> > + insn = le32_to_cpup(p);
> > + if (aarch64_insn_is_b(insn))
> > + return p + aarch64_get_branch_offset(insn);
> > +
> > + WARN_ON(1);
> > + }
> > + return addr;
> > +}
>
> Also, can this please have a comment decrying the lack of built-in for
> this?
Sure.