Re: [PATCH v3 bpf-next 02/14] bpf: introduce cgroup storage maps

From: Roman Gushchin
Date: Fri Jul 27 2018 - 13:13:19 EST


On Fri, Jul 27, 2018 at 06:11:31AM +0200, Daniel Borkmann wrote:
> On 07/20/2018 07:45 PM, Roman Gushchin wrote:
> > This commit introduces BPF_MAP_TYPE_CGROUP_STORAGE maps:
> > a special type of maps which are implementing the cgroup storage.
> >
> > From the userspace point of view it's almost a generic
> > hash map with the (cgroup inode id, attachment type) pair
> > used as a key.
> >
> > The only difference is that some operations are restricted:
> > 1) a user can't create new entries,
> > 2) a user can't remove existing entries.
> >
> > The lookup from userspace is o(log(n)).
> >
> > Signed-off-by: Roman Gushchin <guro@xxxxxx>
> > Cc: Alexei Starovoitov <ast@xxxxxxxxxx>
> > Cc: Daniel Borkmann <daniel@xxxxxxxxxxxxx>
> > Acked-by: Martin KaFai Lau <kafai@xxxxxx>
>
> (First of all sorry for the late review, only limited availability this week
> on my side.)

Np, thank you for the review!

>
> > ---
> > include/linux/bpf-cgroup.h | 38 +++++
> > include/linux/bpf.h | 1 +
> > include/linux/bpf_types.h | 3 +
> > include/uapi/linux/bpf.h | 6 +
> > kernel/bpf/Makefile | 1 +
> > kernel/bpf/local_storage.c | 367 +++++++++++++++++++++++++++++++++++++++++++++
> > kernel/bpf/verifier.c | 12 ++
> > 7 files changed, 428 insertions(+)
> > create mode 100644 kernel/bpf/local_storage.c
> >
> > diff --git a/include/linux/bpf-cgroup.h b/include/linux/bpf-cgroup.h
> > index 79795c5fa7c3..6b0e7bd4b154 100644
> > --- a/include/linux/bpf-cgroup.h
> > +++ b/include/linux/bpf-cgroup.h
> > @@ -3,19 +3,39 @@
> > #define _BPF_CGROUP_H
> >
> > #include <linux/jump_label.h>
> > +#include <linux/rbtree.h>
> > #include <uapi/linux/bpf.h>
> >
> [...]
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index 15d69b278277..0b089ba4595d 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -5140,6 +5140,14 @@ static int replace_map_fd_with_map_ptr(struct bpf_verifier_env *env)
> > return -E2BIG;
> > }
> >
> > + if (map->map_type == BPF_MAP_TYPE_CGROUP_STORAGE &&
> > + bpf_cgroup_storage_assign(env->prog, map)) {
> > + verbose(env,
> > + "only one cgroup storage is allowed\n");
> > + fdput(f);
> > + return -EBUSY;
> > + }
> > +
> > /* hold the map. If the program is rejected by verifier,
> > * the map will be released by release_maps() or it
> > * will be used by the valid program until it's unloaded
> > @@ -5148,6 +5156,10 @@ static int replace_map_fd_with_map_ptr(struct bpf_verifier_env *env)
> > map = bpf_map_inc(map, false);
> > if (IS_ERR(map)) {
> > fdput(f);
> > + if (map->map_type ==
> > + BPF_MAP_TYPE_CGROUP_STORAGE)
> > + bpf_cgroup_storage_release(env->prog,
> > + map);
>
> I think this behavior is a bit strange, meaning that we would reset the map via
> bpf_cgroup_storage_release() in this case, but if we error out and exit in any
> later instruction the prior bpf_cgroup_storage_assign() is not undone, meaning
> at this point we have no other choice but to destroy the map since any later
> BPF prog load with bpf_cgroup_storage_assign() attempt would fail with -EBUSY
> even though it's not assigned anywhere, is that correct? Same also on any other
> errors along the prog load path. E.g. say, as one example, your verifier buffer
> is too small, so any retry with the very same program from loader side would fail
> above due to different prog pointers?

Yeah, I see...
We should call bpf_cgroup_storage_release() from the generic verifier error path.
I'll fix this and the leak in the other patch and resend soon.

Thanks!