Re: [PATCH v3 17/19] mm: memcg/slab: use a single set of kmem_caches for all allocations

From: Jesper Dangaard Brouer
Date: Wed May 27 2020 - 04:36:01 EST


On Tue, 26 May 2020 16:55:05 +0200
Vlastimil Babka <vbabka@xxxxxxx> wrote:

> On 4/22/20 10:47 PM, Roman Gushchin wrote:
> > Instead of having two sets of kmem_caches: one for system-wide and
> > non-accounted allocations and the second one shared by all accounted
> > allocations, we can use just one.
> >
> > The idea is simple: space for obj_cgroup metadata can be allocated
> > on demand and filled only for accounted allocations.
> >
> > It allows to remove a bunch of code which is required to handle
> > kmem_cache clones for accounted allocations. There is no more need
> > to create them, accumulate statistics, propagate attributes, etc.
> > It's a quite significant simplification.
> >
> > Also, because the total number of slab_caches is reduced almost twice
> > (not all kmem_caches have a memcg clone), some additional memory
> > savings are expected. On my devvm it additionally saves about 3.5%
> > of slab memory.
> >
> > Suggested-by: Johannes Weiner <hannes@xxxxxxxxxxx>
> > Signed-off-by: Roman Gushchin <guro@xxxxxx>
>
> Reviewed-by: Vlastimil Babka <vbabka@xxxxxxx>
>
> However, as this series will affect slab fastpaths, and perhaps
> especially this patch will affect even non-kmemcg allocations being
> freed, I'm CCing Jesper and Mel for awareness as they AFAIK did work
> on network stack memory management performance, and perhaps some
> benchmarks are in order...

Thanks for the heads-up!

We (should) all know Mel Gorman's tests, which is here[1]:
[1] https://github.com/gormanm/mmtests

My guess is that these change will only be visible with micro
benchmarks of the slub/slab. I my slab/slub micro benchmarks are
located here [2] https://github.com/netoptimizer/prototype-kernel/

It is kernel modules that is compiled against your devel tree and pushed
to the remote host. Results are simply printk'ed in dmesg.
Usage compile+push commands documented here[3]:
[3] https://prototype-kernel.readthedocs.io/en/latest/prototype-kernel/build-process.html

I recommend trying: "slab_bulk_test01"
modprobe slab_bulk_test01; rmmod slab_bulk_test01
dmesg

Result from these kernel module benchmarks are included in some
commits[4][5]. And in [4] I found some overhead caused by MEMCG.

[4] https://git.kernel.org/torvalds/c/ca257195511d
[5] https://git.kernel.org/torvalds/c/fbd02630c6e3
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer