Re: [PATCH v2 0/8] io_uring: Initial support for {s,g}etsockopt commands
From: Stanislav Fomichev
Date: Tue Aug 08 2023 - 14:24:07 EST
On 08/08, Breno Leitao wrote:
> This patchset adds support for getsockopt (SOCKET_URING_OP_GETSOCKOPT)
> and setsockopt (SOCKET_URING_OP_SETSOCKOPT) in io_uring commands.
> SOCKET_URING_OP_SETSOCKOPT implements the generic case, covering all levels
> and optnames. On the other hand, SOCKET_URING_OP_GETSOCKOPT only
> implements the SOL_SOCKET level, which seems to be the
> most common level parameter for get/setsockopt(2).
>
> struct proto_ops->setsockopt() uses sockptr instead of userspace
> pointers, which makes it easy to bind to io_uring. Unfortunately
> proto_ops->getsockopt() callback uses userspace pointers, except for
> SOL_SOCKET, which is handled by sk_getsockopt(). Thus, this patchset
> leverages sk_getsockopt() to implement the SOCKET_URING_OP_GETSOCKOPT
> case.
>
> In order to support BPF hooks, I modified the hooks to use sockptr, so
> they are flexible enough to accept user or kernel pointers for
> optval/optlen.
>
> PS1: For the getsockopt command, the optlen field is not a userspace
> pointer but an absolute value, so this is slightly different from
> getsockopt(2) behaviour. The new optlen value is returned in cqe->res.
>
> PS2: The userspace pointers need to be alive until the operation is
> completed.
>
> These changes were tested with a new test[1] in liburing. On the BPF
> side, I tested that no regression was introduced by running "test_progs"
> self test using "sockopt" test case.
>
> [1] Link: https://github.com/leitao/liburing/blob/getsock/test/socket-getsetsock-cmd.c
>
> RFC -> V1:
> * Copy user memory at io_uring subsystem, and call proto_ops
> callbacks using kernel memory
> * Implement all the cases for SOCKET_URING_OP_SETSOCKOPT
I did a quick pass and will take a closer look later today. So far everything makes
sense to me.
Should we properly test it as well?
We have tools/testing/selftests/bpf/prog_tests/sockopt.c which does
most of the sanity checks, but it uses regular socket/{g,s}etsockopt
syscalls. Seems like it should be pretty easy to extend this with an
io_uring path? tools/testing/selftests/net/io_uring_zerocopy_tx.c
already implements minimal wrappers which we can most likely borrow.
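
For reference, here is a rough sketch of the kind of syscall-based
round-trip check an io_uring variant would need to mirror. This is not
code from sockopt.c: reuseaddr_roundtrip() is a hypothetical helper, and
the choice of AF_UNIX and SO_REUSEADDR is only an assumption to keep the
example small. It sets a SOL_SOCKET option with setsockopt(2) and reads it
back with getsockopt(2).

```c
#include <assert.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper (not taken from the selftests): set SO_REUSEADDR
 * through setsockopt(2), read it back through getsockopt(2).
 * Returns 1 if the option reads back as enabled, 0 if disabled,
 * and -1 if any syscall fails.
 */
static int reuseaddr_roundtrip(int enable)
{
	int val = enable;
	int out = -1;
	socklen_t len = sizeof(out);
	int fd;

	/* AF_UNIX is an assumption; SOL_SOCKET options are family-independent. */
	fd = socket(AF_UNIX, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;

	if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &val, sizeof(val)) < 0)
		goto err;

	/* With the regular syscall, optlen is a pointer and is written back. */
	if (getsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &out, &len) < 0)
		goto err;

	/* SO_REUSEADDR is an int-sized option. */
	assert(len == sizeof(out));

	close(fd);
	return out != 0;
err:
	close(fd);
	return -1;
}
```

An io_uring version of the same check would presumably differ in the
getsockopt step: per the cover letter, optlen is passed by value and the
resulting length comes back in cqe->res instead of through a pointer.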