[RFC v2 00/23] io_uring BPF requests

From: Pavel Begunkov
Date: Wed May 19 2021 - 10:13:56 EST


The main problem solved is feeding completion information of other
requests, in the form of CQEs, back into BPF. I decided to wire up support
for multiple completion queues (aka CQs) and give BPF programs access to
them, leaving userspace in control of synchronisation, which should
be much more flexible than the link-based approach.

For instance, there can be a separate CQ for each BPF program, so no
extra sync is needed, and communication can be done by submitting a
request targeting a neighboring CQ or submitting a CQE there directly
(see test3 below). The CQ is chosen by sqe->cq_idx, so everyone can
cross-fire if willing.
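
To make the routing idea above concrete, here is a toy userspace model of
"one CQ per program, selected by an index". It is only a sketch: every
toy_* name, NR_CQS and CQ_ENTRIES are hypothetical and exist purely for
illustration. It is not the kernel implementation, not the uapi from this
series, and not liburing API; the only thing borrowed from the text is that
the target CQ is picked by an index, as sqe->cq_idx does.

```c
/* Toy model only: all names below are hypothetical, not kernel/liburing API. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CQS     2	/* assumption: one CQ per BPF program */
#define CQ_ENTRIES 4	/* power of two, so the index mask is CQ_ENTRIES - 1 */

struct toy_cqe {
	uint64_t user_data;
	int32_t res;
};

struct toy_cq {
	uint32_t head;	/* consumer position, free-running */
	uint32_t tail;	/* producer position, free-running */
	struct toy_cqe cqes[CQ_ENTRIES];
};

struct toy_ring {
	struct toy_cq cqs[NR_CQS];
};

/*
 * Post a CQE into the CQ selected by cq_idx, mimicking how a request (or a
 * direct CQE submission) would target a neighbour's queue.
 * Returns 0 on success, -1 if the index is out of range or the CQ is full.
 */
static int toy_emit_cqe(struct toy_ring *ring, uint32_t cq_idx,
			uint64_t user_data, int32_t res)
{
	struct toy_cq *cq;

	assert(ring != NULL);
	if (cq_idx >= NR_CQS)
		return -1;
	cq = &ring->cqs[cq_idx];
	if (cq->tail - cq->head == CQ_ENTRIES)
		return -1;
	cq->cqes[cq->tail & (CQ_ENTRIES - 1)] =
		(struct toy_cqe){ .user_data = user_data, .res = res };
	cq->tail++;
	return 0;
}

/*
 * Reap the oldest CQE from the CQ selected by cq_idx.
 * Returns 0 and fills *out on success, -1 if the index is out of range or
 * the CQ is empty.
 */
static int toy_reap_cqe(struct toy_ring *ring, uint32_t cq_idx,
			struct toy_cqe *out)
{
	struct toy_cq *cq;

	assert(ring != NULL && out != NULL);
	if (cq_idx >= NR_CQS)
		return -1;
	cq = &ring->cqs[cq_idx];
	if (cq->head == cq->tail)
		return -1;
	*out = cq->cqes[cq->head & (CQ_ENTRIES - 1)];
	cq->head++;
	return 0;
}
```

In this model, program 0 "pings" program 1 with toy_emit_cqe(&ring, 1, ...)
and program 1 picks it up with toy_reap_cqe(&ring, 1, ...). Because each
queue has a single consumer, no extra synchronisation between the two
consumers is needed, which is the point made above.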

A bunch of other features were added to play around with (see the v1
changelog below or test1); some are purely experimental. The interfaces
are not even close to settled.
Note: there are known problems. One may live-lock a task; it's unlikely
to happen, but better to be aware of it.

For convenience, a git branch for the kernel part is at [1], and
libbpf + examples are at [2]. Examples are written in restricted C and
libbpf, and are under examples/bpf/, see [3], with 4 BPF programs and 4
corresponding test cases in uring.c. It's already shaping up to be
interesting to play with.

test1: just a set of use examples for features
test2/counting: ticks-react N times using timeout reqs and CQ waiting
test3/pingpong: two BPF reqs do message-based communication by
                repeatedly writing a CQE to another program's CQ and
                waiting for a response
test4/write_file: BPF writes N bytes to a file keeping QD>1

[1] https://github.com/isilence/linux/tree/ebpf_v2
[2] https://github.com/isilence/liburing/tree/ebpf_v2
[3] https://github.com/isilence/liburing/tree/ebpf_v2/examples/bpf

since v1:
- several bug fixes
- support multiple CQs
- allow BPF requests to wait on CQs
- BPF helpers for emit/reap CQE
- expose user_data to BPF program
- sleepable + let BPF read/write from userspace

Pavel Begunkov (23):
  io_uring: shuffle rarely used ctx fields
  io_uring: localise fixed resources fields
  io_uring: remove dependency on ring->sq/cq_entries
  io_uring: deduce cq_mask from cq_entries
  io_uring: kill cached_cq_overflow
  io_uring: rename io_get_cqring
  io_uring: extract struct for CQ
  io_uring: internally pass CQ indexes
  io_uring: extract cq size helper
  io_uring: add support for multiple CQs
  io_uring: enable mmap'ing additional CQs
  bpf: add IOURING program type
  io_uring: implement bpf prog registration
  io_uring: add support for bpf requests
  io_uring: enable BPF to submit SQEs
  io_uring: enable bpf to submit CQEs
  io_uring: enable bpf to reap CQEs
  libbpf: support io_uring
  io_uring: pass user_data to bpf executor
  bpf: Add bpf_copy_to_user() helper
  io_uring: wire bpf copy to user
  io_uring: don't wait on CQ exclusively
  io_uring: enable bpf reqs to wait for CQs

 fs/io_uring.c                  | 794 +++++++++++++++++++++++++++------
 include/linux/bpf.h            |   1 +
 include/linux/bpf_types.h      |   2 +
 include/uapi/linux/bpf.h       |  12 +
 include/uapi/linux/io_uring.h  |  15 +-
 kernel/bpf/helpers.c           |  17 +
 kernel/bpf/syscall.c           |   1 +
 kernel/bpf/verifier.c          |   5 +-
 tools/include/uapi/linux/bpf.h |   7 +
 tools/lib/bpf/libbpf.c         |   7 +
 10 files changed, 722 insertions(+), 139 deletions(-)

--
2.31.1