Re: [PATCH 00/16] The new slab memory controller
From: Roman Gushchin
Date: Tue Dec 10 2019 - 13:05:34 EST
On Tue, Dec 10, 2019 at 11:53:08AM +0530, Bharata B Rao wrote:
> On Mon, Dec 09, 2019 at 06:04:22PM +0000, Roman Gushchin wrote:
> > On Mon, Dec 09, 2019 at 05:26:49PM +0530, Bharata B Rao wrote:
> > > On Mon, Dec 09, 2019 at 02:47:52PM +0530, Bharata B Rao wrote:
> > Hello, Bharata!
> >
> > Thank you very much for the report and the patch, it's a good catch,
> > and the code looks good to me. I'll include the fix into the next
> > version of the patchset (I can't keep it as a separate fix due to massive
> > renamings/rewrites).
>
> Sure, but please note that I did post only the core change without
> the associated header includes etc, where I took some short cuts.
Sure, I'll adapt the code for the next version.
>
> >
> > >
> > > But that still doesn't explain why we don't hit this problem on x86.
> >
> > On x86 (and arm64) we're using vmap-based stacks, so the underlying memory is
> > allocated directly by the page allocator, bypassing the slab allocator.
> > It depends on CONFIG_VMAP_STACK.
>
> I turned off CONFIG_VMAP_STACK on x86, but still don't hit any
> problems.
If you look at kernel/fork.c (~ line 184), there are two ORed conditions
that bypass the slab allocator for thread stacks:
1) THREAD_SIZE >= PAGE_SIZE
2) CONFIG_VMAP_STACK
I guess the first one is what saves x86 in your case: with 4k pages the stack
spans several pages and comes straight from the page allocator. On ppc you
might have 64k pages (hard to say without looking at your config), so the
stack is smaller than a page and gets carved out of a dedicated kmem_cache
instead.
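For reference, the allocation paths look roughly like this (a simplified
sketch from memory, not the exact code -- details and line numbers differ
between kernel versions):

/* Simplified sketch of alloc_thread_stack_node() in kernel/fork.c */
#if THREAD_SIZE >= PAGE_SIZE || defined(CONFIG_VMAP_STACK)

static unsigned long *alloc_thread_stack_node(struct task_struct *tsk, int node)
{
#ifdef CONFIG_VMAP_STACK
	/*
	 * vmalloc-backed stack: the pages come from the page allocator,
	 * never from slab (the real code also keeps a small per-cpu
	 * cache of stacks).
	 */
	void *stack = __vmalloc_node_range(THREAD_SIZE, THREAD_ALIGN,
					   VMALLOC_START, VMALLOC_END,
					   THREADINFO_GFP, PAGE_KERNEL,
					   0, node,
					   __builtin_return_address(0));
	tsk->stack = stack;
	return stack;
#else
	/* The stack is one or more whole pages from the page allocator. */
	struct page *page = alloc_pages_node(node, THREADINFO_GFP,
					     THREAD_SIZE_ORDER);
	tsk->stack = page ? page_address(page) : NULL;
	return tsk->stack;
#endif
}

#else /* THREAD_SIZE < PAGE_SIZE && !CONFIG_VMAP_STACK */

/*
 * E.g. ppc64 with 64k pages: the stack is smaller than a page, so it is
 * carved out of a dedicated slab cache -- the path that hits the problem
 * you reported.
 */
static struct kmem_cache *thread_stack_cache;

static unsigned long *alloc_thread_stack_node(struct task_struct *tsk, int node)
{
	unsigned long *stack;

	stack = kmem_cache_alloc_node(thread_stack_cache, THREADINFO_GFP, node);
	tsk->stack = stack;
	return stack;
}

#endif

So on x86 the first condition keeps the stack on the page allocator path
even with CONFIG_VMAP_STACK disabled.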
>
> $ grep VMAP .config
> CONFIG_HAVE_ARCH_HUGE_VMAP=y
> CONFIG_HAVE_ARCH_VMAP_STACK=y
> # CONFIG_VMAP_STACK is not set
>
> May be something else prevents this particular crash on x86?
I'm pretty sure it would crash if the stack were allocated using
the slab allocator. I bet everybody is using vmap-based stacks.
>
> >
> > Btw, thank you for looking into the patchset and trying it on powerpc.
> > Would you mind to share some results?
>
> Sure, I will get back with more results, but initial numbers when running
> a small alpine docker image look promising.
>
> With slab patches
> # docker stats --no-stream
> CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
> 24bc99d94d91 sleek 0.00% 1MiB / 25MiB 4.00% 1.81kB / 0B 0B / 0B 0
>
> Without slab patches
> # docker stats --no-stream
> CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
> 52382f8aaa13 sleek 0.00% 8.688MiB / 25MiB 34.75% 1.53kB / 0B 0B / 0B 0
>
> So that's an improvement of MEM USAGE from 8.688MiB to 1MiB. Note that this
> docker container isn't doing anything useful and hence the numbers
> aren't representative of any workload.
Cool, that's great!
Small containers are where the relative win is the biggest. Of course, it will
shrink as containers get larger, but that's expected.
If you get any additional numbers, please share them. They're really
interesting, especially if you have larger-than-4k pages.
Thank you!
Roman