Re: [bpf-next v6 4/5] bpf, x86: Emit ENDBR for indirect jump targets

From: Alexei Starovoitov

Date: Fri Mar 06 2026 - 22:34:39 EST


On Fri, Mar 6, 2026 at 7:15 PM Xu Kuohai <xukuohai@xxxxxxxxxxxxxxx> wrote:
>
> On 3/7/2026 9:36 AM, Eduard Zingerman wrote:
> > On Fri, 2026-03-06 at 18:23 +0800, Xu Kuohai wrote:
> >> From: Xu Kuohai <xukuohai@xxxxxxxxxx>
> >>
> >> On CPUs that support CET/IBT, the indirect jump selftest triggers
> >> a kernel panic because the indirect jump targets lack ENDBR
> >> instructions.
> >>
> >> To fix it, emit an ENDBR instruction at each indirect jump target. Since
> >> the ENDBR instruction shifts the position of the original jited
> >> instructions, fix the instruction address calculation wherever the
> >> addresses are used.
> >>
> >> For reference, below is a sample panic log.
> >>
> >> Missing ENDBR: bpf_prog_2e5f1c71c13ac3e0_big_jump_table+0x97/0xe1
> >> ------------[ cut here ]------------
> >> kernel BUG at arch/x86/kernel/cet.c:133!
> >> Oops: invalid opcode: 0000 [#1] SMP NOPTI
> >>
> >> ...
> >>
> >> ? 0xffffffffc00fb258
> >> ? bpf_prog_2e5f1c71c13ac3e0_big_jump_table+0x97/0xe1
> >> bpf_prog_test_run_syscall+0x110/0x2f0
> >> ? fdget+0xba/0xe0
> >> __sys_bpf+0xe4b/0x2590
> >> ? __kmalloc_node_track_caller_noprof+0x1c7/0x680
> >> ? bpf_prog_test_run_syscall+0x215/0x2f0
> >> __x64_sys_bpf+0x21/0x30
> >> do_syscall_64+0x85/0x620
> >> ? bpf_prog_test_run_syscall+0x1e2/0x2f0
> >>
> >> Fixes: 493d9e0d6083 ("bpf, x86: add support for indirect jumps")
> >> Signed-off-by: Xu Kuohai <xukuohai@xxxxxxxxxx>
> >> ---
> >> arch/x86/net/bpf_jit_comp.c | 23 +++++++++++++++--------
> >> 1 file changed, 15 insertions(+), 8 deletions(-)
> >>
> >> diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> >> index 2c57ee446fc9..752331a64fc0 100644
> >> --- a/arch/x86/net/bpf_jit_comp.c
> >> +++ b/arch/x86/net/bpf_jit_comp.c
> >> @@ -1658,8 +1658,8 @@ static int emit_spectre_bhb_barrier(u8 **pprog, u8 *ip,
> >> return 0;
> >> }
> >>
> >> -static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image,
> >> - int oldproglen, struct jit_context *ctx, bool jmp_padding)
> >> +static int do_jit(struct bpf_verifier_env *env, struct bpf_prog *bpf_prog, int *addrs, u8 *image,
> >> + u8 *rw_image, int oldproglen, struct jit_context *ctx, bool jmp_padding)
> >> {
> >> bool tail_call_reachable = bpf_prog->aux->tail_call_reachable;
> >> struct bpf_insn *insn = bpf_prog->insnsi;
> >> @@ -1743,6 +1743,11 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image
> >> dst_reg = X86_REG_R9;
> >> }
> >>
> >> +#ifdef CONFIG_X86_KERNEL_IBT
> >> + if (bpf_insn_is_indirect_target(env, bpf_prog, i - 1))
> >> + EMIT_ENDBR();
> >> +#endif
> >> +
> >> switch (insn->code) {
> >> /* ALU */
> >> case BPF_ALU | BPF_ADD | BPF_X:
> >> @@ -2449,7 +2454,7 @@ st: if (is_imm8(insn->off))
> >>
> >> /* call */
> >> case BPF_JMP | BPF_CALL: {
> >> - u8 *ip = image + addrs[i - 1];
> >> + u8 *ip = image + addrs[i - 1] + (prog - temp);
> >
> > Sorry, meant to reply to v5 but got distracted.
> > It seems tedious/error prone to have this addend at each location,
> > would it be possible to move the 'ip' variable calculation outside
> > of the switch? It appears that at each point there would be no
> > EMIT invocations between 'ip' computation and usage.
> >
>
> Besides the changes shown in this patch, there is another line in the
> file that computes an address using 'image + addrs[i - 1] + (prog - temp)'.
>
> It is at the call to emit_return() in the 'BPF_JMP | BPF_EXIT' case.
> But there are indeed EMIT*() invocations before the address computation,
> so an address pre-computed before the switch statement would be stale in
> this case.
>
> To fix this, how about introducing a macro for the address computation,
> as the following diff (based on this patch) shows:
>
> --- a/arch/x86/net/bpf_jit_comp.c
> +++ b/arch/x86/net/bpf_jit_comp.c
> @@ -1606,6 +1606,8 @@ static void emit_priv_frame_ptr(u8 **pprog, void __percpu *priv_frame_ptr)
>
> #define INSN_SZ_DIFF (((addrs[i] - addrs[i - 1]) - (prog - temp)))
>
> +#define CURR_IP (image + addrs[i - 1] + (prog - temp))
> +

No. Don't obfuscate it with a macro.
I don't like INSN_SZ_DIFF either.
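
For reference, the stale-address concern discussed above can be reproduced
outside the kernel. The following is only a rough userspace sketch, not the
JIT code: struct toy_jit, toy_emit1(), toy_cur_ip() and toy_stale_gap() are
hypothetical names, and the offsets in addrs[] are made up. It assumes that
addrs[i - 1] is the start offset of instruction i and that 'prog - temp'
counts the bytes already emitted for it, as in the expression quoted in the
thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the JIT state; not the kernel's definitions. */
struct toy_jit {
	uint8_t *image;    /* where the finished program will live */
	const int *addrs;  /* addrs[k]: end offset of insn k, addrs[0] == 0 */
	uint8_t temp[64];  /* scratch buffer for the insn being emitted */
	uint8_t *prog;     /* write cursor inside temp */
};

/* Append one byte to the scratch buffer, like a one-byte EMIT. */
static void toy_emit1(struct toy_jit *j, uint8_t b)
{
	*j->prog++ = b;
}

/* Address of the next byte to be emitted for insn i (1-based). */
static uint8_t *toy_cur_ip(const struct toy_jit *j, int i)
{
	return j->image + j->addrs[i - 1] + (j->prog - j->temp);
}

/* How far a pre-computed address drifts once an ENDBR64 is emitted. */
static ptrdiff_t toy_stale_gap(void)
{
	static uint8_t image[128];
	static const int addrs[] = { 0, 10, 20 };  /* made-up offsets */
	struct toy_jit j = { .image = image, .addrs = addrs };
	uint8_t *early, *late;

	j.prog = j.temp;

	/* Computed before anything is emitted for insn 2. */
	early = toy_cur_ip(&j, 2);
	assert(early == image + 10);

	/* ENDBR64 is the four-byte sequence f3 0f 1e fa. */
	toy_emit1(&j, 0xf3);
	toy_emit1(&j, 0x0f);
	toy_emit1(&j, 0x1e);
	toy_emit1(&j, 0xfa);

	/* Recomputed at the point of use. */
	late = toy_cur_ip(&j, 2);
	return late - early;
}
```

In this sketch the value computed before the emit is off by the four ENDBR64
bytes, which is why the address has to be taken at the point of use, however
the expression ends up being spelled.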