Re: [PATCH bpf-next v3 0/6] bpf trampoline support "jmp" mode
From: Leon Hwang
Date: Thu Apr 02 2026 - 02:13:47 EST
On Tue, Nov 18, 2025 at 08:36:28PM +0800, Menglong Dong wrote:
>For now, the bpf trampoline is called by the "call" instruction. However,
>it break the RSB and introduce extra overhead in x86_64 arch.
>
>For example, we hook the function "foo" with fexit, the call and return
>logic will be like this:
> call foo -> call trampoline -> call foo-body ->
> return foo-body -> return foo
>
>As we can see above, there are 3 call, but 2 return, which break the RSB
>balance. We can pseudo a "return" here, but it's not the best choice,
>as it will still cause once RSB miss:
> call foo -> call trampoline -> call foo-body ->
> return foo-body -> return dummy -> return foo
>
>The "return dummy" doesn't pair the "call trampoline", which can also
>cause the RSB miss.
>
>Therefore, we introduce the "jmp" mode for bpf trampoline, as advised by
>Alexei in [1]. And the logic will become this:
> call foo -> jmp trampoline -> call foo-body ->
> return foo-body -> return foo
>
>As we can see above, the RSB is totally balanced after this series.
>
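(Inline note on the accounting quoted above: it can be checked with a toy model of the RSB. This is only a sketch. It treats the RSB as an unbounded stack, whereas the hardware structure has a fixed depth. The function name simulate_rsb and the address labels are made up for illustration, and the "call" mode sequence assumes the trampoline returns straight to foo's caller.)

```python
def simulate_rsb(ops):
    """Count return mispredictions for a sequence of control transfers.

    ops is a list of (kind, addr) pairs:
      ("call", addr) pushes addr, the return address, onto the model RSB;
      ("ret", addr)  pops a prediction and compares it with addr, the
                     address actually returned to;
      ("jmp", None)  leaves the model RSB untouched.
    Returns (misses, entries_left_on_rsb).
    """
    rsb = []
    misses = 0
    for kind, addr in ops:
        if kind == "call":
            rsb.append(addr)
        elif kind == "ret":
            predicted = rsb.pop() if rsb else None
            if predicted != addr:
                misses += 1
    return misses, len(rsb)


# Labels name the code that a return address points back into.
CALL_MODE = [
    ("call", "caller"),      # call foo
    ("call", "foo"),         # call trampoline
    ("call", "trampoline"),  # call foo-body
    ("ret", "trampoline"),   # return foo-body
    ("ret", "caller"),       # return foo (assumed: straight back to the caller)
]
DUMMY_RETURN = [
    ("call", "caller"),      # call foo
    ("call", "foo"),         # call trampoline
    ("call", "trampoline"),  # call foo-body
    ("ret", "trampoline"),   # return foo-body
    ("ret", "dummy"),        # return dummy: does not go back into foo
    ("ret", "caller"),       # return foo
]
JMP_MODE = [
    ("call", "caller"),      # call foo
    ("jmp", None),           # jmp trampoline
    ("call", "trampoline"),  # call foo-body
    ("ret", "trampoline"),   # return foo-body
    ("ret", "caller"),       # return foo
]

if __name__ == "__main__":
    for name, ops in [("call", CALL_MODE), ("dummy", DUMMY_RETURN), ("jmp", JMP_MODE)]:
        print(name, simulate_rsb(ops))
```

In this model the "call" sequence gives one miss and leaves one stale entry, the dummy-return sequence gives one miss with nothing left over, and the "jmp" sequence gives no miss and nothing left over.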
Hi, this is a late footnote to this optimization.
Since this optimization landed in the 6.19 kernel, the function graph
feature of bpfsnoop [1] no longer works, because the tracee's FP/IP is
missing for fexit.
Before this optimization,
caller
  -> call icmp_rcv              caller IP/FP
    -> call trampoline          icmp_rcv IP/FP
      -> call icmp_rcv body     trampoline IP/FP
      <- return to trampoline
  <- return to caller
After this optimization,
caller
  -> call icmp_rcv              caller IP/FP
    -> jump to trampoline
      -> call icmp_rcv body     trampoline IP/FP
      <- return to trampoline
  <- return to caller
As a result, the call stack entry for icmp_rcv is gone.
This can be confirmed with the bpf_get_stack*() helpers.
$ sudo bpfsnoop -k icmp_rcv --output-stack -v
In 6.14,
0xffff8000802bda44:bpfsnoop_fn+0x6a4
0xffff8000802bda44:bpfsnoop_fn+0x6a4
0xffff8000802bd064:bpf_trampoline_6442573163+0xa4
0xffffc7825c984df0:icmp_rcv+0x8
0xffffc7825c91bcb8:ip_protocol_deliver_rcu+0x48
0xffffc7825c91bfd4:ip_local_deliver_finish+0x8c
0xffffc7825c91c0d0:ip_local_deliver+0x88
In 6.19,
0xffffffffc0209069:bpfsnoop_fn+0x449
0xffffffffc01ef2a4:bpf_trampoline_6442568724+0x64
0xffffffffb1085cda:ip_protocol_deliver_rcu+0x1ea
0xffffffffb1085d96:ip_local_deliver_finish+0x86
0xffffffffb1085e95:ip_local_deliver+0x65
This may surprise users who rely on the tracee's entry in the stack.
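To illustrate why the entry disappears, here is a toy model of the unwinding, not the kernel's unwinder. It assumes that only a "call" leaves a return address pointing back into the code that issued it, and that a "jmp" leaves nothing. The function name unwind and the tuple layout are made up for this sketch, and the symbol names are taken from the traces above.

```python
def unwind(live_transfers):
    """Return the symbols a return-address walk would report, innermost first.

    live_transfers lists the control transfers that have not yet returned,
    outermost first, as (kind, source, target) tuples. Only a "call" leaves
    a return address pointing back into source; a "jmp" leaves nothing.
    """
    return_sites = [src for kind, src, _ in live_transfers if kind == "call"]
    current = live_transfers[-1][2]
    return [current] + return_sites[::-1]


# State while the fexit program runs, with the trampoline entered by "call".
BEFORE = [
    ("call", "ip_protocol_deliver_rcu", "icmp_rcv"),
    ("call", "icmp_rcv", "bpf_trampoline"),
    ("call", "bpf_trampoline", "bpfsnoop_fn"),
]

# Same state, with the trampoline entered by "jmp".
AFTER = [
    ("call", "ip_protocol_deliver_rcu", "icmp_rcv"),
    ("jmp", "icmp_rcv", "bpf_trampoline"),
    ("call", "bpf_trampoline", "bpfsnoop_fn"),
]

if __name__ == "__main__":
    print(unwind(BEFORE))
    print(unwind(AFTER))
```

In this model the walk over BEFORE still reports icmp_rcv between the trampoline and ip_protocol_deliver_rcu, while the walk over AFTER goes from the trampoline straight to ip_protocol_deliver_rcu, which is the same shape as the 6.19 trace.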
[1] https://github.com/bpfsnoop/bpfsnoop
Thanks,
Leon
[...]