Re: [PATCH bpf-next v9 3/4] bpf, arm64: Implement bpf_arch_text_poke() for arm64

From: Jon Hunter
Date: Mon Jul 18 2022 - 09:53:12 EST



On 11/07/2022 16:08, Xu Kuohai wrote:
Implement bpf_arch_text_poke() for arm64, so bpf prog or bpf trampoline
can be patched with it.

When the target address is NULL, the original instruction is patched to a
NOP.

When the target address and the source address are within the branch
range, the original instruction is patched to a bl instruction to the
target address directly.
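
As an illustration of the two cases above, here is a minimal userspace sketch of how the replacement instruction could be computed. The helper name gen_patch_insn() is hypothetical and is not taken from the patch; the sketch assumes the caller has already checked that the target is within branch range. The NOP and BL encodings are the standard A64 ones.

```c
#include <stdint.h>

#define A64_NOP 0xd503201fu	/* A64 NOP */
#define A64_BL  0x94000000u	/* A64 BL with imm26 == 0 */

/*
 * Hypothetical helper: return the instruction to write at address pc.
 * A NULL (0) target yields a NOP; otherwise a BL to target.
 * Assumes the caller has verified that target is within BL range.
 */
static uint32_t gen_patch_insn(uint64_t pc, uint64_t target)
{
	int64_t offset;

	if (target == 0)
		return A64_NOP;

	/* BL stores the signed byte offset divided by 4 in imm26. */
	offset = (int64_t)(target - pc);
	return A64_BL | (uint32_t)(((uint64_t)offset >> 2) & 0x03ffffffu);
}
```

For example, a forward branch of 8 bytes encodes as 0x94000002 and a backward branch of 8 bytes as 0x97fffffe.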

To support attaching bpf trampoline to both regular kernel function and
bpf prog, we follow the ftrace patchsite way for bpf prog. That is, two
instructions are inserted at the beginning of bpf prog, the first one
saves the return address to x9, and the second is a nop which will be
patched to a bl instruction when a bpf trampoline is attached.

However, when a bpf trampoline is attached to a bpf prog, the distance
between the target address and the source address may exceed 128MB, the
maximum branch range, because the bpf trampoline and the bpf prog are
allocated separately with vmalloc. So long jumps must be handled.
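
A rough sketch of the range check implied here, written as standalone C. The name is_long_jump() and the macro RANGE_128M are illustrative assumptions; the bounds follow from the signed 26-bit word offset of the bl instruction, which reaches from -128MB up to 128MB minus 4 bytes.

```c
#include <stdbool.h>
#include <stdint.h>

#define RANGE_128M (128LL * 1024 * 1024)	/* illustrative name */

/*
 * Hypothetical check: true if a bl placed at pc cannot reach target
 * directly. A NULL (0) target means "patch to nop", so it is never
 * treated as a long jump.
 */
static bool is_long_jump(uint64_t pc, uint64_t target)
{
	int64_t offset;

	if (target == 0)
		return false;

	offset = (int64_t)(target - pc);
	return offset < -RANGE_128M || offset >= RANGE_128M;
}
```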

When a bpf prog is constructed, a plt pointing to empty trampoline
dummy_tramp is placed at the end:

bpf_prog:
        mov x9, lr
        nop // patchsite
        ...
        ret

plt:
        ldr x10, target
        br x10
target:
        .quad dummy_tramp // plt target

This is also the state when no trampoline is attached.
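
The plt layout above can be pictured as a 16-byte record: two instructions followed by a 64-bit literal. The following is only a sketch under that assumption; the struct and function names are made up for illustration, and the two constants are the standard A64 encodings of "ldr x10, <pc + 8>" and "br x10".

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative layout of the plt; names are assumptions. */
struct plt_sketch {
	uint32_t insn_ldr;	/* ldr x10, target */
	uint32_t insn_br;	/* br x10 */
	uint64_t target;	/* address loaded by the ldr */
};

#define A64_LDR_X10_PC8 0x5800004au	/* ldr x10, <pc + 8> */
#define A64_BR_X10      0xd61f0140u	/* br x10 */

/* Fill a plt so that it initially branches to the dummy trampoline. */
static void plt_sketch_init(struct plt_sketch *plt, uint64_t dummy_tramp)
{
	plt->insn_ldr = A64_LDR_X10_PC8;
	plt->insn_br = A64_BR_X10;
	plt->target = dummy_tramp;
}
```

Because the literal sits 8 bytes after the ldr, retargeting the plt only requires rewriting the 64-bit target field; the two instructions stay unchanged.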

When a short-jump bpf trampoline is attached, the patchsite is patched to
a bl instruction to the trampoline directly:

bpf_prog:
        mov x9, lr
        bl <short-jump bpf trampoline address> // patchsite
        ...
        ret

plt:
        ldr x10, target
        br x10
target:
        .quad dummy_tramp // plt target

When a long-jump bpf trampoline is attached, the plt target is filled with
the trampoline address and the patchsite is patched to a bl instruction to
the plt:

bpf_prog:
        mov x9, lr
        bl plt // patchsite
        ...
        ret

plt:
        ldr x10, target
        br x10
target:
        .quad <long-jump bpf trampoline address>

dummy_tramp is used to prevent another CPU from jumping to an unknown
location during the patching process, making the patching process easier.

The patching process is as follows:

1. when neither the old address nor the new address is a long jump, the
patchsite is replaced with a bl to the new address, or a nop if the new
address is NULL;

2. when the old address is not a long jump but the new one is, the
branch target address is written to plt first, then the patchsite
is replaced with a bl instruction to the plt;

3. when the old address is a long jump but the new one is not, the address
of dummy_tramp is written to plt first, then the patchsite is replaced
with a bl to the new address, or a nop if the new address is NULL;

4. when both the old address and the new address are long jumps, the
new address is written to plt and the patchsite is not changed.
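
The four cases can be summarised as a small decision function. This is a sketch only: the types and names (patch_plan, plan_patch, the SITE_* values) are hypothetical, and it models which writes happen rather than performing any real text patching. When a plt write is planned, it is meant to happen before the patchsite is changed, as described above.

```c
#include <stdbool.h>
#include <stdint.h>

/* What to do with the patchsite instruction (illustrative names). */
enum site_action { SITE_KEEP, SITE_NOP, SITE_BL_TARGET, SITE_BL_PLT };

struct patch_plan {
	bool write_plt;		/* write plt_value to the plt target first */
	uint64_t plt_value;
	enum site_action site;	/* then update the patchsite */
};

static struct patch_plan plan_patch(bool old_long, bool new_long,
				    uint64_t new_addr, uint64_t dummy_tramp)
{
	struct patch_plan plan = { false, 0, SITE_KEEP };

	if (new_long) {
		/* Cases 2 and 4: the plt target becomes the new address. */
		plan.write_plt = true;
		plan.plt_value = new_addr;
		plan.site = old_long ? SITE_KEEP : SITE_BL_PLT;
	} else {
		/* Case 3: park the plt on dummy_tramp before repatching. */
		if (old_long) {
			plan.write_plt = true;
			plan.plt_value = dummy_tramp;
		}
		/* Cases 1 and 3: direct bl, or nop for a NULL address. */
		plan.site = new_addr ? SITE_BL_TARGET : SITE_NOP;
	}
	return plan;
}
```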

Signed-off-by: Xu Kuohai <xukuohai@xxxxxxxxxx>
Acked-by: Song Liu <songliubraving@xxxxxx>
Reviewed-by: Jakub Sitnicki <jakub@xxxxxxxxxxxxxx>
Reviewed-by: KP Singh <kpsingh@xxxxxxxxxx>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@xxxxxxxxxx>
---
arch/arm64/net/bpf_jit.h | 7 +
arch/arm64/net/bpf_jit_comp.c | 329 ++++++++++++++++++++++++++++++++--
2 files changed, 322 insertions(+), 14 deletions(-)


This change appears to be causing the build to fail ...

/tmp/cc52xO0c.s: Assembler messages:
/tmp/cc52xO0c.s:8: Error: operand 1 should be an integer register -- `mov lr,x9'
/tmp/cc52xO0c.s:7: Error: undefined symbol lr used as an immediate value
make[2]: *** [scripts/Makefile.build:250: arch/arm64/net/bpf_jit_comp.o] Error 1
make[1]: *** [scripts/Makefile.build:525: arch/arm64/net] Error 2

Let me know if you have any thoughts.

Cheers
Jon
