[PATCH bpf-next v1 0/7] bpf/sockmap: add splice support for tcp_bpf
From: Jiayuan Chen
Date: Wed Mar 04 2026 - 01:37:42 EST
Starting from Go 1.22.0, TCPConn implements the WriteTo interface [1],
which internally uses the splice(2) syscall to transfer data between
file descriptors [2].
However, for sockets with sockmap enabled, sk_prot is replaced with
tcp_bpf_prots which does not provide a splice_read callback. When data
is redirected to a socket's psock ingress queue via bpf_msg_redirect,
splice(2) cannot read from it because the splice path has no knowledge
of the psock queue. This causes TCPConn.WriteTo to return 0 bytes,
effectively breaking Go applications that rely on io.Copy between TCP
connections when sockmap/BPF is in use [3].
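For reference, the userspace pattern at issue looks roughly like the
sketch below: data is spliced from a source descriptor into a pipe and
then from the pipe into the destination. This only illustrates the
splice(2) sequence described above and is not Go's actual implementation;
splice_relay() is a hypothetical helper name and error handling is kept
minimal.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: move up to len bytes from in_fd to out_fd through
 * an intermediate pipe. Returns the number of bytes moved, or -1 with
 * errno set.
 */
ssize_t splice_relay(int in_fd, int out_fd, size_t len)
{
	int p[2];
	ssize_t total = 0;
	int saved_errno = 0;

	if (pipe(p) < 0)
		return -1;

	while ((size_t)total < len) {
		/* Stage 1: source -> pipe. */
		ssize_t in = splice(in_fd, NULL, p[1], NULL,
				    len - (size_t)total, SPLICE_F_MOVE);
		if (in < 0 && errno == EINTR)
			continue;
		if (in < 0) {
			saved_errno = errno;
			break;
		}
		if (in == 0)
			break;	/* EOF, or nothing the splice path can see */

		/* Stage 2: drain the pipe into the destination. */
		while (in > 0) {
			ssize_t out = splice(p[0], NULL, out_fd, NULL,
					     (size_t)in, SPLICE_F_MOVE);
			if (out < 0 && errno == EINTR)
				continue;
			if (out <= 0) {
				saved_errno = out < 0 ? errno : EIO;
				break;
			}
			in -= out;
			total += out;
		}
		if (saved_errno)
			break;
	}

	close(p[0]);
	close(p[1]);
	if (saved_errno) {
		errno = saved_errno;
		return -1;
	}
	return total;
}
```

If in_fd is a sockmap-enabled TCP socket whose data sits in the psock
ingress queue, the stage 1 splice() call is where the problem described
above would be expected to surface.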
The simplest fix would be registering a splice callback that just calls
copy_splice_read(), but this results in redundant copies (socket -> kernel
buffer -> pipe -> destination), which defeats the purpose of splice.
Patch 1 adds splice_read to struct proto and sets it in TCP.
Patch 2 adds inet_splice_read and uses it in inet_stream_ops.
Patch 3 refactors tcp_bpf recvmsg with a read actor abstraction.
Patch 4 adds basic splice_read support for sockmap, but this still
involves two data copies.
Patch 5 optimizes the splice implementation by transferring page
ownership directly into the pipe, achieving true zero-copy. Benchmarks
show performance on par with the read(2) path.
Patch 6 adds splice selftests. Since splice can serve as a drop-in
replacement for read here, the existing selftests redefine read as splice
so that all existing test cases also cover the splice path.
Patch 7 adds splice to the sockmap benchmark, which also serves to
verify the effectiveness of our zero-copy implementation.
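The reason for routing splice through struct proto (patches 1 and 2) can
be pictured with a small userspace model: once the socket-layer entry
point dispatches through sk_prot, swapping sk_prot is enough to change
which queue a splice read consumes. Everything below (the toy_ types,
function names and byte counters) is hypothetical and only models the
dispatch; it does not reproduce the kernel signatures or the actual
tcp_bpf logic.

```c
#include <stddef.h>

struct toy_sock;

/* Toy stand-in for struct proto: only the callback of interest. */
struct toy_proto {
	const char *name;
	long (*splice_read)(struct toy_sock *sk, size_t len);
};

struct toy_sock {
	const struct toy_proto *sk_prot;
	size_t rcv_queue_bytes;		/* models the regular receive queue */
	size_t psock_ingress_bytes;	/* models the psock ingress queue */
};

/* Consume up to len bytes from a queue counter and report how many. */
static size_t toy_take(size_t *avail, size_t len)
{
	size_t n = *avail < len ? *avail : len;

	*avail -= n;
	return n;
}

/* Plain TCP: only knows about the regular receive queue. */
long toy_tcp_splice_read(struct toy_sock *sk, size_t len)
{
	return (long)toy_take(&sk->rcv_queue_bytes, len);
}

/* sockmap variant: reads what was redirected to the ingress queue. */
long toy_tcp_bpf_splice_read(struct toy_sock *sk, size_t len)
{
	return (long)toy_take(&sk->psock_ingress_bytes, len);
}

const struct toy_proto toy_tcp_prot = { "tcp", toy_tcp_splice_read };
const struct toy_proto toy_tcp_bpf_prot = { "tcp_bpf", toy_tcp_bpf_splice_read };

/* Models a socket-layer entry point that dispatches through sk_prot. */
long toy_inet_splice_read(struct toy_sock *sk, size_t len)
{
	if (!sk->sk_prot || !sk->sk_prot->splice_read)
		return -1;
	return sk->sk_prot->splice_read(sk, len);
}
```

In this model, a socket holding 100 redirected bytes returns 0 from
toy_inet_splice_read() while sk_prot points at toy_tcp_prot, and returns
the requested bytes once sk_prot is switched to toy_tcp_bpf_prot.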
Benchmark results with rx-verdict-ingress mode (loopback, 8 CPUs):
  read(2):                  ~4292 MB/s
  splice(2) + zero-copy:    ~4270 MB/s
  splice(2) + always-copy:  ~2770 MB/s
Zero-copy splice achieves near-parity with read(2), while the
always-copy fallback is ~35% slower.
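The percentages follow directly from the throughput figures; the short
sketch below recomputes them. slowdown_pct() is a hypothetical helper
introduced only for this illustration.

```c
#include <stdio.h>

/* Percentage by which value falls short of baseline (both in MB/s). */
double slowdown_pct(double baseline, double value)
{
	return (baseline - value) / baseline * 100.0;
}

/* Print both splice variants relative to the read(2) figure. */
void print_slowdowns(void)
{
	const double read_mbps = 4292.0;

	printf("zero-copy:   %.1f%% slower than read(2)\n",
	       slowdown_pct(read_mbps, 4270.0));
	printf("always-copy: %.1f%% slower than read(2)\n",
	       slowdown_pct(read_mbps, 2770.0));
}
```

The zero-copy case works out to roughly half a percent below read(2), and
the always-copy case to roughly 35%.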
[1] https://github.com/golang/go/blob/master/src/net/tcpsock.go#L173
[2] https://github.com/golang/go/blob/fdf3bee/src/net/tcpsock_posix.go#L57
[3] https://github.com/jschwinger233/bpf_msg_redirect_bug_reproducer
Jiayuan Chen (7):
  net: add splice_read to struct proto and set it in tcp_prot/tcpv6_prot
  inet: add inet_splice_read() and use it in
    inet_stream_ops/inet6_stream_ops
  tcp_bpf: refactor recvmsg with read actor abstraction
  tcp_bpf: add splice_read support for sockmap
  tcp_bpf: optimize splice_read with zero-copy for non-slab pages
  selftests/bpf: add splice_read tests for sockmap
  selftests/bpf: add splice option to sockmap benchmark
 include/linux/skmsg.h                        |  12 +-
 include/net/inet_common.h                    |   3 +
 include/net/sock.h                           |   3 +
 net/core/skmsg.c                             |  34 ++-
 net/ipv4/af_inet.c                           |  15 +-
 net/ipv4/tcp_bpf.c                           | 227 +++++++++++++++---
 net/ipv4/tcp_ipv4.c                          |   1 +
 net/ipv6/af_inet6.c                          |   2 +-
 net/ipv6/tcp_ipv6.c                          |   1 +
 .../selftests/bpf/benchs/bench_sockmap.c     |  57 ++++-
 .../selftests/bpf/prog_tests/sockmap_basic.c |  28 ++-
 .../bpf/prog_tests/sockmap_helpers.h         |  62 +++++
 .../selftests/bpf/prog_tests/sockmap_strp.c  |  28 ++-
 13 files changed, 421 insertions(+), 52 deletions(-)
--
2.43.0