Re: [PATCH 2/5] bpf: Define new BPF_MAP_TYPE_USER_RINGBUF map type
From: Andrii Nakryiko
Date: Thu Aug 11 2022 - 19:23:21 EST
On Mon, Aug 8, 2022 at 8:54 AM David Vernet <void@xxxxxxxxxxxxx> wrote:
>
> We want to support a ringbuf map type where samples are published from
> user-space to BPF programs. BPF currently supports a kernel -> user-space
> circular ringbuffer via the BPF_MAP_TYPE_RINGBUF map type. We'll need to
> define a new map type for user-space -> kernel, as none of the helpers
> exported for BPF_MAP_TYPE_RINGBUF will apply to a user-space producer
> ringbuffer, and we'll want to add one or more helper functions that would
> not apply for a kernel-producer ringbuffer.
>
> This patch therefore adds a new BPF_MAP_TYPE_USER_RINGBUF map type
> definition. The map type is useless in its current form, as there is no way
> to access or use it for anything until we add more BPF helpers. A follow-on
> patch will therefore add a new helper function that allows BPF programs to
> run callbacks on samples that are published to the ringbuffer.
>
> Signed-off-by: David Vernet <void@xxxxxxxxxxxxx>
> ---
> include/linux/bpf_types.h | 1 +
> include/uapi/linux/bpf.h | 1 +
> kernel/bpf/ringbuf.c | 70 +++++++++++++++++++++++++++++-----
> kernel/bpf/verifier.c | 3 ++
> tools/include/uapi/linux/bpf.h | 1 +
> tools/lib/bpf/libbpf.c | 1 +
> 6 files changed, 68 insertions(+), 9 deletions(-)
>
[...]
>
> -static int ringbuf_map_mmap(struct bpf_map *map, struct vm_area_struct *vma)
> +static int ringbuf_map_mmap(struct bpf_map *map, struct vm_area_struct *vma,
> + bool kernel_producer)
> {
> struct bpf_ringbuf_map *rb_map;
>
> rb_map = container_of(map, struct bpf_ringbuf_map, map);
>
> if (vma->vm_flags & VM_WRITE) {
> - /* allow writable mapping for the consumer_pos only */
> - if (vma->vm_pgoff != 0 || vma->vm_end - vma->vm_start != PAGE_SIZE)
> + if (kernel_producer) {
> + /* allow writable mapping for the consumer_pos only */
> + if (vma->vm_pgoff != 0 || vma->vm_end - vma->vm_start != PAGE_SIZE)
> + return -EPERM;
> + /* For user ringbufs, disallow writable mappings to the
> + * consumer pointer, and allow writable mappings to both the
> + * producer position, and the ring buffer data itself.
> + */
> + } else if (vma->vm_pgoff == 0)
> return -EPERM;
The asymmetrical use of {} (braces on one if branch, none on the
other) is extremely confusing, please don't do that.
The way the big comment sits inside the wrong if branch also throws me
off. Maybe move it right before the return -EPERM it describes, with
proper indentation?
Sorry for the nitpicks, but I was stuck for a few minutes trying to
figure out what exactly is happening here :)
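
For illustration only, here is a minimal userspace sketch of the layout
suggested above: braces on both branches, and each comment directly above
the check it describes. struct mock_vma, the MOCK_* constants and the
function names are assumptions made so that the snippet compiles outside
the kernel tree; it returns 0 where the real function would go on to remap
the pages.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Userspace stand-ins for kernel definitions (assumed names and values). */
#define MOCK_PAGE_SIZE   4096UL
#define MOCK_VM_WRITE    0x2UL
#define MOCK_VM_MAYWRITE 0x20UL

struct mock_vma {
	unsigned long vm_start;
	unsigned long vm_end;
	unsigned long vm_pgoff;
	unsigned long vm_flags;
};

/* Same checks as the quoted hunk, only laid out symmetrically. */
int mock_ringbuf_mmap_check(struct mock_vma *vma, bool kernel_producer)
{
	if (vma->vm_flags & MOCK_VM_WRITE) {
		if (kernel_producer) {
			/* allow writable mapping for the consumer_pos only */
			if (vma->vm_pgoff != 0 ||
			    vma->vm_end - vma->vm_start != MOCK_PAGE_SIZE)
				return -EPERM;
		} else {
			/* For user ringbufs, disallow writable mappings to
			 * the consumer pointer, and allow writable mappings
			 * to both the producer position and the ring buffer
			 * data itself.
			 */
			if (vma->vm_pgoff == 0)
				return -EPERM;
		}
	} else {
		vma->vm_flags &= ~MOCK_VM_MAYWRITE;
	}

	return 0; /* the real code would continue to the remap step here */
}

/* Usage: a writable one-page mapping at page offset 0 passes for the
 * kernel-producer case and is rejected for the user-producer case.
 */
void mock_ringbuf_mmap_check_example(void)
{
	struct mock_vma vma = { 0, MOCK_PAGE_SIZE, 0, MOCK_VM_WRITE };

	assert(mock_ringbuf_mmap_check(&vma, true) == 0);
	assert(mock_ringbuf_mmap_check(&vma, false) == -EPERM);
}
```
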
> } else {
> vma->vm_flags &= ~VM_MAYWRITE;
> @@ -242,6 +271,16 @@ static int ringbuf_map_mmap(struct bpf_map *map, struct vm_area_struct *vma)
> vma->vm_pgoff + RINGBUF_PGOFF);
> }
>
> +static int ringbuf_map_mmap_kern(struct bpf_map *map, struct vm_area_struct *vma)
> +{
> + return ringbuf_map_mmap(map, vma, true);
> +}
> +
> +static int ringbuf_map_mmap_user(struct bpf_map *map, struct vm_area_struct *vma)
> +{
> + return ringbuf_map_mmap(map, vma, false);
> +}
I wouldn't mind if you just had two separate implementations of
ringbuf_map_mmap for the _kern and _user cases, tbh. That would
probably be clearer as well.
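
A rough userspace sketch of what two separate implementations could look
like, with no bool parameter and each function carrying only its own rule.
Everything prefixed mock_/MOCK_ is an assumption made so the snippet builds
outside the kernel; the shared remap step at the end of the real function
is omitted and replaced by return 0.

```c
#include <assert.h>
#include <errno.h>

/* Userspace stand-ins for kernel definitions (assumed names and values). */
#define MOCK_PAGE_SIZE   4096UL
#define MOCK_VM_WRITE    0x2UL
#define MOCK_VM_MAYWRITE 0x20UL

struct mock_vma {
	unsigned long vm_start;
	unsigned long vm_end;
	unsigned long vm_pgoff;
	unsigned long vm_flags;
};

/* Kernel-producer ringbuf: user space may write the consumer_pos page only. */
int mock_ringbuf_map_mmap_kern(struct mock_vma *vma)
{
	if (vma->vm_flags & MOCK_VM_WRITE) {
		/* allow writable mapping for the consumer_pos only */
		if (vma->vm_pgoff != 0 ||
		    vma->vm_end - vma->vm_start != MOCK_PAGE_SIZE)
			return -EPERM;
	} else {
		vma->vm_flags &= ~MOCK_VM_MAYWRITE;
	}

	return 0; /* the real code would remap the ringbuf pages here */
}

/* User-producer ringbuf: user space may write the producer position and
 * the data pages, but not the consumer position at page offset 0.
 */
int mock_ringbuf_map_mmap_user(struct mock_vma *vma)
{
	if (vma->vm_flags & MOCK_VM_WRITE) {
		/* disallow writable mappings to the consumer pointer */
		if (vma->vm_pgoff == 0)
			return -EPERM;
	} else {
		vma->vm_flags &= ~MOCK_VM_MAYWRITE;
	}

	return 0; /* the real code would remap the ringbuf pages here */
}

/* Usage: the same writable mapping at page offset 0 is accepted by the
 * _kern variant and rejected by the _user variant.
 */
void mock_ringbuf_map_mmap_example(void)
{
	struct mock_vma vma = { 0, MOCK_PAGE_SIZE, 0, MOCK_VM_WRITE };

	assert(mock_ringbuf_map_mmap_kern(&vma) == 0);
	assert(mock_ringbuf_map_mmap_user(&vma) == -EPERM);
}
```

The cost is a few duplicated lines for the read-only branch; the benefit
is that neither function needs a comment explaining the other one's rule.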
> +
> static unsigned long ringbuf_avail_data_sz(struct bpf_ringbuf *rb)
> {
> unsigned long cons_pos, prod_pos;
> @@ -269,7 +308,7 @@ const struct bpf_map_ops ringbuf_map_ops = {
> .map_meta_equal = bpf_map_meta_equal,
> .map_alloc = ringbuf_map_alloc,
> .map_free = ringbuf_map_free,
> - .map_mmap = ringbuf_map_mmap,
> + .map_mmap = ringbuf_map_mmap_kern,
> .map_poll = ringbuf_map_poll,
> .map_lookup_elem = ringbuf_map_lookup_elem,
> .map_update_elem = ringbuf_map_update_elem,
> @@ -278,6 +317,19 @@ const struct bpf_map_ops ringbuf_map_ops = {
> .map_btf_id = &ringbuf_map_btf_ids[0],
> };
>
[...]