Re: [PATCH bpf-next] bpf: ringbuf: Support consuming BPF_MAP_TYPE_RINGBUF from prog
From: Andrii Nakryiko
Date: Tue Sep 10 2024 - 20:40:23 EST
On Tue, Sep 10, 2024 at 4:44 PM Daniel Xu <dxu@xxxxxxxxx> wrote:
>
> On Tue, Sep 10, 2024 at 03:21:04PM GMT, Andrii Nakryiko wrote:
> > On Tue, Sep 10, 2024 at 3:16 PM Daniel Xu <dxu@xxxxxxxxx> wrote:
> > >
> > >
> > >
> > > On Tue, Sep 10, 2024, at 2:07 PM, Daniel Xu wrote:
> > > > On Tue, Sep 10, 2024 at 01:41:41PM GMT, Andrii Nakryiko wrote:
> > > >> On Tue, Sep 10, 2024 at 11:36 AM Alexei Starovoitov
> > > [...]
> > > >
> > > >>
> > > >> Also, Daniel, can you please make sure that the dynptr we return for each
> > > >> sample is read-only? We shouldn't give the consumer BPF program the ability
> > > >> to corrupt ringbuf record headers (accidentally or otherwise).
> > > >
> > > > Sure.
> > >
> > > > So the sample is not read-only. But I think the prog is prevented from
> > > > messing with the header regardless.
> > >
> > > __bpf_user_ringbuf_peek() returns sample past the header:
> > >
> > > >     *sample = (void *)((uintptr_t)rb->data +
> > > >                        (uintptr_t)((cons_pos + BPF_RINGBUF_HDR_SZ) & rb->mask));
> > >
> > > dynptr is initialized with the above ptr:
> > >
> > > >     bpf_dynptr_init(&dynptr, sample, BPF_DYNPTR_TYPE_LOCAL, 0, size);
> > >
> > > > So I don't think there's a way for the prog to access the header through the dynptr.
> > >
> >
> > By "header" I mean the 8 bytes that precede each submitted ringbuf record.
> > That header is part of the ringbuf data area. Given that user space can set
> > consumer_pos to an arbitrary value, the kernel can return an arbitrary part
> > of the ringbuf data area, including that 8-byte header. If that data is
> > writable, it's easy to screw up that header and crash another BPF
> > program that reserves/submits a new record. User space can only read the
> > data area of a BPF ringbuf, so we rely heavily on tight control
> > of who can write what into those 8 bytes.
>
> Ah, ok. I think I understand.
>
> Given this and your other comments about rb->busy, what about enforcing
> bpf_user_ringbuf_drain() NAND mmap? I think the use cases here are
> different enough that this makes sense.

You mean disabling user-space mmap()? TBH, I'd like to understand the
use case first before we make such decisions. Maybe what you need is
not really a BPF ringbuf? Can you give us a bit more detail on what
you are trying to achieve?
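
To make the header concern above concrete, here is a minimal user-space
sketch (a toy model, not kernel code). It reuses the offset arithmetic from
the quoted __bpf_user_ringbuf_peek() snippet and the 8-byte header size. The
64-byte data area, the two-field struct rec_hdr layout, and the helper names
sample_off(), put_record(), hdr_len() and clobber_demo() are all assumptions
made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define HDR_SZ  8u              /* stands in for BPF_RINGBUF_HDR_SZ */
#define DATA_SZ 64u             /* toy data area size, power of two (assumption) */
#define MASK    (DATA_SZ - 1)

/* Hypothetical 8-byte record header; only len is inspected in this sketch. */
struct rec_hdr {
	uint32_t len;
	uint32_t pg_off;
};

static uint8_t data[DATA_SZ];   /* toy stand-in for the ringbuf data area */

/* Same arithmetic as the quoted peek: the sample starts one header past cons_pos. */
static uint32_t sample_off(uint64_t cons_pos)
{
	return (uint32_t)((cons_pos + HDR_SZ) & MASK);
}

/* Producer side: write a header at pos, then len payload bytes right after it. */
static void put_record(uint64_t pos, uint32_t len)
{
	struct rec_hdr h = { .len = len, .pg_off = 0 };

	assert(len <= DATA_SZ - HDR_SZ);
	memcpy(&data[pos & MASK], &h, sizeof(h));
	memset(&data[(pos + HDR_SZ) & MASK], 0xAB, len);
}

/* Read back the len field of the header stored at pos. */
static uint32_t hdr_len(uint64_t pos)
{
	struct rec_hdr h;

	memcpy(&h, &data[pos & MASK], sizeof(h));
	return h.len;
}

/*
 * Record A occupies bytes 0..15 (header 0..7, payload 8..15).
 * Record B occupies bytes 16..31 (header 16..23, payload 24..31).
 * A bogus consumer position of 8 makes the "sample" start at offset 16,
 * which is record B's header. Returns 1 if a write through that sample
 * changed B's header, 0 otherwise.
 */
static int clobber_demo(void)
{
	uint32_t off;

	memset(data, 0, sizeof(data));
	put_record(0, 8);
	put_record(16, 8);

	off = sample_off(8);            /* bogus consumer_pos chosen by user space */
	memset(&data[off], 0xFF, 8);    /* consumer writes through a writable sample */

	return hdr_len(16) != 8;
}
```

In this model an honest consumer position of 0 yields a sample at offset 8
(record A's payload), while a position of 8 yields offset 16, so a writable
sample overwrites record B's length field. A read-only dynptr would rule out
the write step regardless of which position user space picks.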