Re: [PATCH bpf-next v4 03/30] bpf: memcg-based memory accounting for bpf maps
From: Roman Gushchin
Date: Tue Aug 25 2020 - 22:38:32 EST
On Tue, Aug 25, 2020 at 04:27:09PM -0700, Shakeel Butt wrote:
> On Fri, Aug 21, 2020 at 8:01 AM Roman Gushchin <guro@xxxxxx> wrote:
> >
> > This patch enables memcg-based memory accounting for memory allocated
> > by __bpf_map_area_alloc(), which is used by most map types for
> > large allocations.
> >
> > If a map is updated from an interrupt context, and the update
> > results in memory allocation, the memory cgroup can't be determined
> > from the context of the current process. To address this case,
> > bpf map preserves a pointer to the memory cgroup of the process,
> > which created the map. This memory cgroup is charged for allocations
> > from interrupt context.
> >
> > Following patches in the series will refine the accounting for
> > some map types.
> >
> > Signed-off-by: Roman Gushchin <guro@xxxxxx>
> > ---
> > include/linux/bpf.h | 4 ++++
> > kernel/bpf/helpers.c | 37 ++++++++++++++++++++++++++++++++++++-
> > kernel/bpf/syscall.c | 27 ++++++++++++++++++++++++++-
> > 3 files changed, 66 insertions(+), 2 deletions(-)
> >
> > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > index a9b7185a6b37..b5f178afde94 100644
> > --- a/include/linux/bpf.h
> > +++ b/include/linux/bpf.h
> > @@ -34,6 +34,7 @@ struct btf_type;
> > struct exception_table_entry;
> > struct seq_operations;
> > struct bpf_iter_aux_info;
> > +struct mem_cgroup;
> >
> > extern struct idr btf_idr;
> > extern spinlock_t btf_idr_lock;
> > @@ -138,6 +139,9 @@ struct bpf_map {
> > u32 btf_value_type_id;
> > struct btf *btf;
> > struct bpf_map_memory memory;
> > +#ifdef CONFIG_MEMCG_KMEM
> > + struct mem_cgroup *memcg;
> > +#endif
> > char name[BPF_OBJ_NAME_LEN];
> > u32 btf_vmlinux_value_type_id;
> > bool bypass_spec_v1;
> > diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> > index be43ab3e619f..f8ce7bc7003f 100644
> > --- a/kernel/bpf/helpers.c
> > +++ b/kernel/bpf/helpers.c
> > @@ -14,6 +14,7 @@
> > #include <linux/jiffies.h>
> > #include <linux/pid_namespace.h>
> > #include <linux/proc_ns.h>
> > +#include <linux/sched/mm.h>
> >
> > #include "../../lib/kstrtox.h"
> >
> > @@ -41,11 +42,45 @@ const struct bpf_func_proto bpf_map_lookup_elem_proto = {
> > .arg2_type = ARG_PTR_TO_MAP_KEY,
> > };
> >
> > +#ifdef CONFIG_MEMCG_KMEM
> > +static __always_inline int __bpf_map_update_elem(struct bpf_map *map, void *key,
> > + void *value, u64 flags)
> > +{
> > + struct mem_cgroup *old_memcg;
> > + bool in_interrupt;
> > + int ret;
> > +
> > + /*
> > + * If update from an interrupt context results in a memory allocation,
> > + * the memory cgroup to charge can't be determined from the context
> > + * of the current task. Instead, we charge the memory cgroup, which
> > + * contained a process created the map.
> > + */
> > + in_interrupt = in_interrupt();
> > + if (in_interrupt)
> > + old_memcg = memalloc_use_memcg(map->memcg);
> > +
>
> The memcg_kmem_bypass() will bypass all __GFP_ACCOUNT allocations even
> before looking at current->active_memcg, so, this patch will be a
> noop.
Good point. Looks like it's a good example of kmem accounting from an interrupt
context, which we've discussed on the Plumbers session.
It means we need some more work on the mm side.
Thanks!