Re: [RFC PATCH bpf-next 00/13] bpf: Introduce BPF namespace

From: Andrii Nakryiko
Date: Fri Apr 07 2023 - 11:59:37 EST


On Thu, Apr 6, 2023 at 6:44 PM Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
>
> On Thu, Apr 06, 2023 at 01:22:26PM -0700, Andrii Nakryiko wrote:
> > On Wed, Apr 5, 2023 at 10:44 PM Yafang Shao <laoar.shao@xxxxxxxxx> wrote:
> > >
> > > On Thu, Apr 6, 2023 at 12:24 PM Alexei Starovoitov
> > > <alexei.starovoitov@xxxxxxxxx> wrote:
> > > >
> > > > On Wed, Apr 5, 2023 at 8:22 PM Yafang Shao <laoar.shao@xxxxxxxxx> wrote:
> > > > >
> > > > > On Thu, Apr 6, 2023 at 11:06 AM Alexei Starovoitov
> > > > > <alexei.starovoitov@xxxxxxxxx> wrote:
> > > > > >
> > > > > > On Wed, Apr 5, 2023 at 7:55 PM Yafang Shao <laoar.shao@xxxxxxxxx> wrote:
> > > > > > >
> > > > > > > It seems that I didn't describe the issue clearly.
> > > > > > > The container doesn't have CAP_SYS_ADMIN, but the CAP_SYS_ADMIN is
> > > > > > > required to run bpftool, so the bpftool running in the container
> > > > > > > can't get the ID of bpf objects or convert IDs to FDs.
> > > > > > > Is there something that I missed ?
> > > > > >
> > > > > > Nothing. This is by design. bpftool needs sudo. That's all.
> > > > > >
> > > > >
> > > > > Hmm, what I'm trying to do is make bpftool run without sudo.
> > > >
> > > > This is not a task that is worth solving.
> > > >
> > >
> > > Then the container with CAP_BPF enabled can't even iterate its bpf progs ...
> >
> > I'll leave the BPF namespace discussion aside (I agree that it needs
> > way more thought).
> >
> > I am a bit surprised that we require CAP_SYS_ADMIN for GET_NEXT_ID
> > operations. GET_FD_BY_ID is definitely CAP_SYS_ADMIN, as they allow
> > you to take over someone else's link and stuff like this. But just
> > iterating IDs seems like a pretty innocent functionality, so maybe we
> > should remove CAP_SYS_ADMIN for GET_NEXT_ID?
> >
> > By itself GET_NEXT_ID is relatively useless without capabilities, but
> > we've been floating the idea of providing GET_INFO_BY_ID (not by FD)
> > for a while now, and that seems useful in itself, as it would indeed
> > help tools like bpftool to get *some* information even without
> > privileges. Whether those GET_INFO_BY_ID operations should return same
> > full bpf_{prog,map,link,btf}_info or some trimmed down version of them
> > would be up to discussion, but I think getting some info without
> > creating an FD seems useful in itself.
> >
> > Would it be worth discussing and solving this separately from
> > namespacing issues?
>
> Iteration of IDs itself is fine. The set of IDs is not security sensitive,
> but GET_NEXT_BY_ID has to be carefully restricted.
> It returns xlated, jited, BTF, line info, etc
> and with all the restrictions it would need something like
> CAP_SYS_PTRACE and CAP_PERFMON to be useful.
> And with that we're not far from CAP_SYS_ADMIN.
> Why bother then?

You probably meant that GET_INFO_BY_ID should be carefully restricted?
So yeah, that's what I said that this would have to be discussed
further. I agree that returning func/line info, program dump, etc is
probably a privileged part. But there is plenty of useful info besides
that (e.g., prog name, insns cnt, run stats, etc) that would be useful
for unpriv applications to monitor their own apps that they opened
from BPF FS, or just some observability daemons.

There is a lot of useful information in bpf_map_info and bpf_link_info
that's way less privileged. I think bpf_link_info is good as is. Same
for bpf_map_info.

Either way, I'm not insisting, just something that seems pretty simple
to add and useful in some scenarios. We can reuse existing code and
types for GET_INFO_BY_FD and just zero-out (or prevent filling out)
those privileged fields you mentioned. Anyway, something to put on the
backburner, perhaps.