[PATCH v7 bpf-next 00/14] bpf: cgroup local storage

From: Roman Gushchin
Date: Thu Aug 02 2018 - 18:01:46 EST

This patchset implements cgroup local storage for bpf programs.
The main idea is to provide fast-access memory for storing
various per-cgroup data, e.g. the number of transmitted packets.

To userspace, cgroup local storage looks like a special type of map
and is accessible through the generic bpf maps API for reading and
updating the data. The (cgroup inode id, attachment type) pair
is used as the map key.

A user can't create new entries or destroy existing entries;
this happens automatically when a user attaches/detaches a bpf program
to/from a cgroup.
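
The userspace-visible behaviour described above can be illustrated with a
small standalone model. This is only a sketch in plain C, not kernel code:
every model_* name, the fixed-size table, and the error codes are assumptions
made for illustration. It shows entries keyed by (cgroup inode id, attachment
type), created only on attach, destroyed only on detach, and otherwise open
to lookup and update.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Model of the (cgroup inode id, attachment type) key; names are illustrative. */
struct model_key {
	uint64_t cgroup_inode_id;
	uint32_t attach_type;
};

struct model_entry {
	struct model_key key;
	uint64_t value;
	int in_use;
};

#define MODEL_MAX_ENTRIES 8
static struct model_entry model_table[MODEL_MAX_ENTRIES];

/* Linear search for an in-use entry with a matching key. */
static struct model_entry *model_find(const struct model_key *k)
{
	for (size_t i = 0; i < MODEL_MAX_ENTRIES; i++) {
		struct model_entry *e = &model_table[i];

		if (e->in_use && e->key.cgroup_inode_id == k->cgroup_inode_id &&
		    e->key.attach_type == k->attach_type)
			return e;
	}
	return NULL;
}

/* Attach path: the only place where an entry is created. */
int model_attach(uint64_t inode_id, uint32_t attach_type)
{
	struct model_key k = { inode_id, attach_type };

	if (model_find(&k))
		return -EEXIST;
	for (size_t i = 0; i < MODEL_MAX_ENTRIES; i++) {
		if (!model_table[i].in_use) {
			model_table[i].key = k;
			model_table[i].value = 0;
			model_table[i].in_use = 1;
			return 0;
		}
	}
	return -ENOMEM;
}

/* Detach path: the only place where an entry is destroyed. */
int model_detach(uint64_t inode_id, uint32_t attach_type)
{
	struct model_key k = { inode_id, attach_type };
	struct model_entry *e = model_find(&k);

	if (!e)
		return -ENOENT;
	e->in_use = 0;
	return 0;
}

/* Map-style update: may change an existing value, never creates an entry. */
int model_update(const struct model_key *k, uint64_t value)
{
	struct model_entry *e = model_find(k);

	if (!e)
		return -ENOENT;
	e->value = value;
	return 0;
}

/* Map-style lookup by key. */
int model_lookup(const struct model_key *k, uint64_t *value)
{
	struct model_entry *e = model_find(k);

	assert(value != NULL);
	if (!e)
		return -ENOENT;
	*value = e->value;
	return 0;
}
```

In this model an update before model_attach() fails, and a lookup after
model_detach() fails, which mirrors the rule that the entry lifetime follows
the attachment rather than any map operation.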

From a bpf program's point of view, cgroup storage is accessible
without lookup using the special get_local_storage() helper function.
It takes a map fd as an argument. It always returns a valid pointer
to the corresponding memory area.

To implement such lookup-free access, a pointer to the cgroup
storage is saved for each attachment of a bpf program to a cgroup,
if required by the program. Before the program runs, that pointer is
stored in a special global per-cpu variable, which is accessible from
the get_local_storage() helper.
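
The mechanism can be sketched in plain userspace C. This is a simplified
model under stated assumptions, not the kernel implementation: a thread-local
variable stands in for the per-cpu variable, and struct attachment,
run_attached(), model_get_local_storage() and count_packet() are hypothetical
names introduced only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Per-cgroup data, e.g. a packet counter. */
struct cgroup_storage {
	uint64_t packets;
};

/* A program attached to a cgroup, plus the storage pointer saved at attach time. */
struct attachment {
	int (*prog)(void);
	struct cgroup_storage *storage;
};

/* Stand-in for the global per-cpu variable: one slot per thread. */
static _Thread_local struct cgroup_storage *current_storage;

/* Stand-in for the helper: no key, no lookup, just the published pointer. */
void *model_get_local_storage(void)
{
	assert(current_storage != NULL);
	return current_storage;
}

/* Example "program": account one packet to whichever cgroup it runs for. */
int count_packet(void)
{
	struct cgroup_storage *s = model_get_local_storage();

	s->packets++;
	return 1;
}

/* Run path: publish the attachment's storage, then run the program. */
int run_attached(struct attachment *a)
{
	current_storage = a->storage;
	return a->prog();
}
```

The same program body can be attached to two cgroups and each run updates
only the storage that belongs to the attachment being executed, because the
pointer is switched before the program starts.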

This patchset implements only cgroup local storage; however, the API
is intentionally made extensible to support other local storage types
in the future, e.g. thread local storage, socket local storage, etc.

- fixed a use-after-free bug, caused by not clearing the
  prog->aux->cgroup_storage pointer after releasing the map

- fixed an error with returning -EINVAL instead of a pointer

- fixed an issue in verifier (test that flags == 0 properly)
- added a corresponding test
- added a note about synchronization, sync docs to tools/uapi/...
- switched the cgroup test to use XADD
- added a check for attr->max_entries to be 0, and attr->map_flags
  to be sane
- use bpf_uncharge_memlock() in bpf_uncharge_memlock()
- rebased to bpf-next

- fixed a leak in cgroup attachment code (discovered by Daniel)
- cgroup storage map will be released if the corresponding
  bpf program fails to load for any reason
- introduced bpf_uncharge_memlock() helper

- fixed more build and sparse issues
- rebased to bpf-next

- fixed build issues
- removed explicit rlimit calls in patch 14
- rebased to bpf-next

Signed-off-by: Roman Gushchin <guro@xxxxxx>
Cc: Alexei Starovoitov <ast@xxxxxxxxxx>
Cc: Martin KaFai Lau <kafai@xxxxxx>

Roman Gushchin (14):
bpf: add ability to charge bpf maps memory dynamically
bpf: introduce cgroup storage maps
bpf: pass a pointer to a cgroup storage using pcpu variable
bpf: allocate cgroup storage entries on attaching bpf programs
bpf: extend bpf_prog_array to store pointers to the cgroup storage
bpf/verifier: introduce BPF_PTR_TO_MAP_VALUE
bpf: don't allow create maps of cgroup local storages
bpf: introduce the bpf_get_local_storage() helper function
bpf: sync bpf.h to tools/
bpftool: add support for CGROUP_STORAGE maps
bpf/test_run: support cgroup local storage
selftests/bpf: add verifier cgroup storage tests
selftests/bpf: add a cgroup storage test
samples/bpf: extend test_cgrp2_attach2 test to use cgroup storage

 drivers/media/rc/bpf-lirc.c                       |  10 +-
 include/linux/bpf-cgroup.h                        |  54 ++++
 include/linux/bpf.h                               |  25 +-
 include/linux/bpf_types.h                         |   3 +
 include/uapi/linux/bpf.h                          |  27 +-
 kernel/bpf/Makefile                               |   1 +
 kernel/bpf/cgroup.c                               |  58 +++-
 kernel/bpf/core.c                                 |  77 ++---
 kernel/bpf/helpers.c                              |  20 ++
 kernel/bpf/local_storage.c                        | 378 ++++++++++++++++++++++
 kernel/bpf/map_in_map.c                           |   3 +-
 kernel/bpf/syscall.c                              |  61 +++-
 kernel/bpf/verifier.c                             |  38 ++-
 net/bpf/test_run.c                                |  13 +-
 net/core/filter.c                                 |  23 +-
 samples/bpf/test_cgrp2_attach2.c                  |  21 +-
 tools/bpf/bpftool/map.c                           |   1 +
 tools/include/uapi/linux/bpf.h                    |  27 +-
 tools/testing/selftests/bpf/Makefile              |   3 +-
 tools/testing/selftests/bpf/bpf_helpers.h         |   2 +
 tools/testing/selftests/bpf/test_cgroup_storage.c | 130 ++++++++
 tools/testing/selftests/bpf/test_verifier.c       | 140 +++++++-
 22 files changed, 1029 insertions(+), 86 deletions(-)
 create mode 100644 kernel/bpf/local_storage.c
 create mode 100644 tools/testing/selftests/bpf/test_cgroup_storage.c