Re: [RFC PATCH bpf-next 00/13] bpf: Introduce BPF namespace

From: Yafang Shao
Date: Thu Apr 06 2023 - 01:44:49 EST


On Thu, Apr 6, 2023 at 12:24 PM Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
>
> On Wed, Apr 5, 2023 at 8:22 PM Yafang Shao <laoar.shao@xxxxxxxxx> wrote:
> >
> > On Thu, Apr 6, 2023 at 11:06 AM Alexei Starovoitov
> > <alexei.starovoitov@xxxxxxxxx> wrote:
> > >
> > > On Wed, Apr 5, 2023 at 7:55 PM Yafang Shao <laoar.shao@xxxxxxxxx> wrote:
> > > >
> > > > It seems that I didn't describe the issue clearly.
> > > > The container doesn't have CAP_SYS_ADMIN, but the CAP_SYS_ADMIN is
> > > > required to run bpftool, so the bpftool running in the container
> > > > can't get the ID of bpf objects or convert IDs to FDs.
> > > > Is there something that I missed ?
> > >
> > > Nothing. This is by design. bpftool needs sudo. That's all.
> > >
> >
> > Hmm, what I'm trying to do is make bpftool run without sudo.
>
> This is not a task that is worth solving.
>

Then a container with CAP_BPF enabled can't even iterate over its own bpf progs ...
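
To make the problem concrete, here is a minimal sketch of what iterating
prog IDs looks like at the syscall level, which is roughly what bpftool has
to do before it can convert an ID to an FD. It assumes a kernel and uapi
header that provide BPF_PROG_GET_NEXT_ID; next_prog_id() and count_progs()
are hypothetical helper names used only for illustration. As far as I
understand the current code, the GET_NEXT_ID commands are gated on
CAP_SYS_ADMIN, so a task holding only CAP_BPF should get -EPERM here.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/bpf.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask the kernel for the first prog ID greater than
 * start_id. Returns 0 and stores the ID in *next_id on success, or a
 * negative errno value on failure.
 */
int next_prog_id(unsigned int start_id, unsigned int *next_id)
{
	union bpf_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.start_id = start_id;

	if (syscall(__NR_bpf, BPF_PROG_GET_NEXT_ID, &attr, sizeof(attr)) < 0)
		return -errno;

	*next_id = attr.next_id;
	return 0;
}

/*
 * Hypothetical helper: walk the whole prog ID space. Returns the number of
 * progs seen, or a negative errno value (for example -EPERM when the caller
 * is not allowed to iterate).
 */
int count_progs(void)
{
	unsigned int id = 0;
	int n = 0;
	int err;

	while ((err = next_prog_id(id, &id)) == 0)
		n++;

	/* -ENOENT is the normal end-of-iteration signal, not a failure. */
	return err == -ENOENT ? n : err;
}
```

Calling count_progs() from a small main() and printing the result should be
enough to compare the behaviour with sudo, with CAP_BPF only, and with no
capabilities at all.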

> > > > Some questions,
> > > > - What if the process exits after attaching the bpf prog and the prog
> > > > is not auto-detachable?
> > > > For example, the reuseport bpf prog is not auto-detachable. After
> > > > pinning the reuseport bpf prog, a task can attach it through the pinned
> > > > bpf file, but if the task forgets to detach it and the pinned file is
> > > > removed, then it seems there's no way to figure out which task or
> > > > cgroup this prog belongs to...
> > >
> > > you're saying that there is a bpf prog in the kernel without
> > > corresponding user space ?
> >
> > No, it is corresponding to user space. For example, it may be
> > corresponding to a socket fd, or a cgroup fd.
> >
> > > Meaning no user space process has an FD
> > > that points to this prog or FD to a map that this prog is using?
> > > In such a case this is truly kernel bpf prog. It doesn't belong to cgroup.
> > >
> >
> > Even if it is kernel bpf prog, it is created by a process. The user
> > needs to know which one created it.
>
> In some situations it's certainly interesting to know which process
> loaded a particular program.
> In many other situations it's irrelevant.
> For example, the process that loaded a prog could have been moved to a
> different cgroup.
> If you want to track the loading you need to install bpf_lsm
> that monitors prog_load hook and collect that info.
> It's not the job of the kernel to do it.
>

I agree that we can add hooks to track every detail of these
operations.
But hooks are not free: more hooks mean more overhead.
If a kernel change can make this tracking lightweight, why not...

> > > > - Could you pls. explain in detail how to get comm, pid, or cgroup
> > > > from a pinned bpffs file?
> > >
> > > pinned bpf prog and no user space holds FD to it?
> > > It's not part of any cgroup. Nothing to print.
> >
> > As I explained above, even if no user space process holds an FD to it,
> > the user still needs information about it. For example, is it expected,
> > and which process created it?
>
> See the answer above. The kernel has enough hooks already to provide
> this information to user space. No kernel changes necessary.



--
Regards
Yafang