Re: [PATCH v6 2/4] bpf: Add bpf_user_ringbuf_drain() helper

From: Andrii Nakryiko
Date: Wed Sep 21 2022 - 19:33:38 EST


On Mon, Sep 19, 2022 at 5:01 PM David Vernet <void@xxxxxxxxxxxxx> wrote:
>
> In a prior change, we added a new BPF_MAP_TYPE_USER_RINGBUF map type which
> will allow user-space applications to publish messages to a ring buffer
> that is consumed by a BPF program in kernel-space. In order for this
> map-type to be useful, it will require a BPF helper function that BPF
> programs can invoke to drain samples from the ring buffer, and invoke
> callbacks on those samples. This change adds that capability via a new BPF
> helper function:
>
> bpf_user_ringbuf_drain(struct bpf_map *map, void *callback_fn, void *ctx,
> u64 flags)
>
> BPF programs may invoke this function to run callback_fn() on a series of
> samples in the ring buffer. callback_fn() has the following signature:
>
> long callback_fn(struct bpf_dynptr *dynptr, void *context);
>
> Samples are provided to the callback in the form of struct bpf_dynptr *'s,
> which the program can read using BPF helper functions for querying
> struct bpf_dynptr's.
>
> In order to support bpf_user_ringbuf_drain(), a new PTR_TO_DYNPTR register
> type is added to the verifier to reflect a dynptr that was allocated by
> a helper function and passed to a BPF program. Unlike PTR_TO_STACK
> dynptrs which are allocated on the stack by a BPF program, PTR_TO_DYNPTR
> dynptrs need not use reference tracking, as the BPF helper is trusted to
> properly free the dynptr before returning. The verifier currently only
> supports PTR_TO_DYNPTR registers that are also DYNPTR_TYPE_LOCAL.
>
> Note that while the corresponding user-space libbpf logic will be added
> in a subsequent patch, this patch does contain an implementation of the
> .map_poll() callback for BPF_MAP_TYPE_USER_RINGBUF maps. This
> .map_poll() callback guarantees that an epoll-waiting user-space
> producer will receive at least one event notification whenever at least
> one sample is drained in an invocation of bpf_user_ringbuf_drain(),
> provided that the function is not invoked with the BPF_RB_NO_WAKEUP
> flag. If the BPF_RB_FORCE_WAKEUP flag is provided, a wakeup
> notification is sent even if no sample was drained.
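
To make the wakeup rule in the paragraph above concrete, here is a minimal userspace sketch of the decision it describes. This is not the code from the patch: drain_should_wake() and the RB_* names are hypothetical, the constants only mirror the BPF_RB_NO_WAKEUP and BPF_RB_FORCE_WAKEUP bit values from the UAPI header, and letting the force flag win when both flags are set is an assumption that the commit message does not spell out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative names; the values mirror BPF_RB_NO_WAKEUP and
 * BPF_RB_FORCE_WAKEUP in include/uapi/linux/bpf.h.
 */
#define RB_NO_WAKEUP    (1ULL << 0)
#define RB_FORCE_WAKEUP (1ULL << 1)

/* Hypothetical model of the rule in the commit message: decide whether a
 * drain invocation should notify an epoll-waiting user-space producer.
 */
static bool drain_should_wake(uint64_t flags, long samples_drained)
{
	assert(samples_drained >= 0);

	/* Forced wakeup: notify even if nothing was drained.
	 * Assumption: this takes precedence when both flags are set.
	 */
	if (flags & RB_FORCE_WAKEUP)
		return true;

	/* Caller explicitly opted out of the notification. */
	if (flags & RB_NO_WAKEUP)
		return false;

	/* Default: notify only if at least one sample was drained. */
	return samples_drained > 0;
}
```

With flags == 0 the producer is notified only when a sample was actually consumed, so a producer blocked in epoll_wait() waiting for free space is not woken by a drain that freed nothing.
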
>
> Signed-off-by: David Vernet <void@xxxxxxxxxxxxx>
> ---
> include/linux/bpf.h | 11 +-
> include/uapi/linux/bpf.h | 38 +++++++
> kernel/bpf/helpers.c | 2 +
> kernel/bpf/ringbuf.c | 181 ++++++++++++++++++++++++++++++++-
> kernel/bpf/verifier.c | 61 ++++++++++-
> tools/include/uapi/linux/bpf.h | 38 +++++++
> 6 files changed, 320 insertions(+), 11 deletions(-)

[...]

> #define __BPF_FUNC_MAPPER(FN) \
> FN(unspec), \
> @@ -5599,6 +5636,7 @@ union bpf_attr {
> FN(tcp_raw_check_syncookie_ipv4), \
> FN(tcp_raw_check_syncookie_ipv6), \
> FN(ktime_get_tai_ns), \
> + FN(user_ringbuf_drain), \
> /* */
>
> /* integer value in 'imm' field of BPF_CALL instruction selects which helper
> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> index 41aeaf3862ec..66217b1857ca 100644
> --- a/kernel/bpf/helpers.c
> +++ b/kernel/bpf/helpers.c
> @@ -1627,6 +1627,8 @@ bpf_base_func_proto(enum bpf_func_id func_id)
> return &bpf_dynptr_write_proto;
> case BPF_FUNC_dynptr_data:
> return &bpf_dynptr_data_proto;
> + case BPF_FUNC_user_ringbuf_drain:
> + return &bpf_user_ringbuf_drain_proto;

In light of [0], which now allows dynptr helpers only with CAP_BPF, I've
moved this case lower, behind the CAP_BPF check, while applying. Thanks!

[0] https://patchwork.kernel.org/project/netdevbpf/patch/20220921143550.30247-1-memxor@xxxxxxxxx/

> default:
> break;
> }

[...]