Re: [PATCH 4/4] arm64: bpf: Elide some moves to a0 after calls

From: Björn Töpel
Date: Tue Feb 04 2020 - 14:13:32 EST


On Tue, 28 Jan 2020 at 03:15, Palmer Dabbelt <palmerdabbelt@xxxxxxxxxx> wrote:
>
> On arm64, the BPF function ABI doesn't match the C function ABI. Specifically,
> arm64 encodes calls as `a0 = f(a0, a1, ...)` while BPF encodes calls as
> `BPF_REG_0 = f(BPF_REG_1, BPF_REG_2, ...)`. This discrepancy results in
> function calls being encoded as a two operations sequence that first does a C
> ABI calls and then moves the return register into the right place. This
> results in one extra instruction for every function call.
>

It's a lot of extra work for one reg-to-reg move, but it has always
annoyed me in the RISC-V JIT. :-) So, if it *can* be avoided, why not?

[...]
>
> +static int dead_register(const struct jit_ctx *ctx, int offset, int bpf_reg)

Given that a lot of archs (RISC-V, arm?, MIPS?) might benefit from
this, it would be nice if it could be made generic (it already is
pretty much), and moved to kernel/bpf.

> +{
> + const struct bpf_prog *prog = ctx->prog;
> + int i;
> +
> + for (i = offset; i < prog->len; ++i) {
> + const struct bpf_insn *insn = &prog->insnsi[i];
> + const u8 code = insn->code;
> + const u8 bpf_dst = insn->dst_reg;
> + const u8 bpf_src = insn->src_reg;
> + const int writes_dst = !((code & BPF_ST) || (code & BPF_STX)
> + || (code & BPF_JMP32) || (code & BPF_JMP));
> + const int reads_dst = !((code & BPF_LD));
> + const int reads_src = true;
> +
> + /* Calls are a bit special in that they clobber a bunch of regisers. */
> + if ((code & (BPF_JMP | BPF_CALL)) || (code & (BPF_JMP | BPF_TAIL_CALL)))
> + if ((bpf_reg >= BPF_REG_0) && (bpf_reg <= BPF_REG_5))
> + return false;
> +
> + /* Registers that are read before they're written are alive.
> + * Most opcodes are of the form DST = DEST op SRC, but there
> + * are some exceptions.*/
> + if (bpf_src == bpf_reg && reads_src)
> + return false;
> +
> + if (bpf_dst == bpf_reg && reads_dst)
> + return false;
> +
> + if (bpf_dst == bpf_reg && writes_dst)
> + return true;
> +
> + /* Most BPF instructions are 8 bits long, but some ar 16 bits
> + * long. */

A bunch of spelling errors above ("regisers", "ar").
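
On the point about making dead_register() generic: a rough userspace
sketch of what a conservative, arch-independent variant could look like
is below. It is not the patch's code. The sk_* names are hypothetical,
struct sk_insn is a simplified stand-in for struct bpf_insn, and the
opcode values are assumed to mirror the uapi encodings. The sketch
decodes the instruction class with a mask rather than testing single
bits, and it simply gives up (reports "not provably dead") at any jump,
call, exit, atomic op or legacy packet load, so it only reasons about
straight-line code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct bpf_insn (hypothetical name). */
struct sk_insn {
	uint8_t code;
	uint8_t dst_reg;
	uint8_t src_reg;
	int16_t off;
	int32_t imm;
};

/* Field extractors and values, assumed to match the uapi encodings. */
#define SK_CLASS(code)	((code) & 0x07)
#define SK_OP(code)	((code) & 0xf0)
#define SK_MODE(code)	((code) & 0xe0)
#define SK_SRC_X(code)	(((code) & 0x08) != 0)

enum { SK_LD = 0, SK_LDX = 1, SK_ST = 2, SK_STX = 3,
       SK_ALU = 4, SK_JMP = 5, SK_JMP32 = 6, SK_ALU64 = 7 };
enum { SK_NEG = 0x80, SK_MOV = 0xb0, SK_END = 0xd0 };
enum { SK_MEM = 0x60 };
#define SK_LD_IMM64	0x18	/* 64-bit immediate load, two slots wide */

/*
 * Returns true only if 'reg' is overwritten before it is read on the
 * straight-line path starting at insns[offset]. Anything the sketch does
 * not model makes it return false, i.e. "assume the register is live".
 */
static bool sk_reg_is_dead(const struct sk_insn *insns, size_t len,
			   size_t offset, uint8_t reg)
{
	size_t i;

	assert(insns != NULL);

	for (i = offset; i < len; i++) {
		const uint8_t code = insns[i].code;
		const uint8_t op = SK_OP(code);
		bool reads_dst = false, reads_src = false, writes_dst = false;

		switch (SK_CLASS(code)) {
		case SK_ALU:
		case SK_ALU64:
			/* For NEG and END the source bit is not a register. */
			reads_src = SK_SRC_X(code) && op != SK_NEG && op != SK_END;
			reads_dst = op != SK_MOV;
			writes_dst = true;
			break;
		case SK_LDX:
			reads_src = true;	/* base address */
			writes_dst = true;
			break;
		case SK_ST:
		case SK_STX:
			if (SK_MODE(code) != SK_MEM)
				return false;	/* atomics: give up */
			reads_dst = true;	/* base address */
			reads_src = SK_CLASS(code) == SK_STX;
			break;
		case SK_LD:
			if (code != SK_LD_IMM64)
				return false;	/* legacy packet loads: give up */
			writes_dst = true;
			break;
		default:
			return false;		/* JMP/JMP32: control flow, give up */
		}

		if ((reads_src && insns[i].src_reg == reg) ||
		    (reads_dst && insns[i].dst_reg == reg))
			return false;
		if (writes_dst && insns[i].dst_reg == reg)
			return true;

		/* Instructions are 8 bytes; the 64-bit immediate load takes 16. */
		if (code == SK_LD_IMM64)
			i++;
	}
	return false;	/* ran off the end without a write */
}
```

A JIT could, for instance, ask whether the register backing BPF_REG_0
is dead at the instruction following a call and skip the move only when
the answer is true; a false answer always falls back to emitting it.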


Cheers,
Björn