Re: [PATCH 10/12] mm, slub: remove percpu slabs with CONFIG_SLUB_TINY

From: Baoquan He
Date: Mon Dec 12 2022 - 22:05:57 EST


On 12/12/22 at 05:11am, Dennis Zhou wrote:
> Hello,
>
> On Mon, Dec 12, 2022 at 11:54:28AM +0100, Vlastimil Babka wrote:
> > On 11/27/22 12:05, Hyeonggon Yoo wrote:
> > > On Mon, Nov 21, 2022 at 06:12:00PM +0100, Vlastimil Babka wrote:
> > >> SLUB gets most of its scalability by percpu slabs. However for
> > >> CONFIG_SLUB_TINY the goal is minimal memory overhead, not scalability.
> > >> Thus, #ifdef out the whole kmem_cache_cpu percpu structure and
> > >> associated code. Additionally to the slab page savings, this reduces
> > >> percpu allocator usage, and code size.
> > >
> > > [+Cc Dennis]
> >
> > +To: Baoquan also.

Thanks for adding me.

> >
> > > Wondering if we can reduce (or zero) early reservation of percpu area
> > > when #if !defined(CONFIG_SLUB) || defined(CONFIG_SLUB_TINY)?
> >
> > Good point. I've sent a PR as it was [1], but (if merged) we can still
> > improve that during RC series, if it means more memory saved thanks to less
> > percpu usage with CONFIG_SLUB_TINY.
> >
> > [1]
> > https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab.git/tag/?h=slab-for-6.2-rc1
>
> The early reservation area not used at boot is then used to serve normal
> percpu allocations. Percpu allocates additional chunks based on a free
> page float count and is backed page by page, not all at once. I get
> slabs is the main motivator of early reservation, but if there are other
> users of percpu, then shrinking the early reservation area is a bit
> moot.

Agree. Before kmem_cache_init() is done, anyone calling alloc_percpu()
can only be served from the early reservation of the percpu area.
So we can't shrink or drop it unless we can make sure nobody needs to
call alloc_percpu() before kmem_cache_init(), now and in the future.

The only drawback of early reservation is that it's not very flexible.
We can dynamically create chunks to grow the percpu area once the early
reservation runs out, but we can't shrink the early reservation if the
system doesn't need that much.

So we may need to weigh the two ideas:
- Not allowing alloc_percpu() before kmem_cache_init();
- Keeping the early reservation, and working out an economical value
  for CONFIG_SLUB_TINY.

start_kernel()
  ->setup_per_cpu_areas();
  ......
  ->mm_init();
    ......
    -->kmem_cache_init();


__alloc_percpu()
  -->pcpu_alloc()
     --> succeed to allocate from early reservation
     or
     -->pcpu_create_chunk()
        -->pcpu_alloc_chunk()
           -->pcpu_mem_zalloc()
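
To make the ordering constraint above concrete, here is a rough userspace
toy model of it. This is not kernel code: every toy_* name and the 64-byte
reservation size are made up for illustration, and it assumes only what the
call chains above say, i.e. that creating a new chunk needs the slab
allocator while the early reservation does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical size of the early reservation; an arbitrary toy value. */
#define TOY_EARLY_RESERVE 64

/* Toy state: all fields are illustrative, not real kernel structures. */
struct toy_percpu {
	size_t early_free;         /* bytes left in the early reservation */
	bool slab_ready;           /* models kmem_cache_init() having run */
	unsigned int extra_chunks; /* chunks created after boot */
};

enum toy_result { TOY_FROM_EARLY, TOY_FROM_NEW_CHUNK, TOY_FAILED };

/* Models setup_per_cpu_areas(): the reservation exists before slab does. */
static void toy_setup_per_cpu_areas(struct toy_percpu *p)
{
	p->early_free = TOY_EARLY_RESERVE;
	p->slab_ready = false;
	p->extra_chunks = 0;
}

/* Models kmem_cache_init(): from here on, chunk metadata can be allocated. */
static void toy_kmem_cache_init(struct toy_percpu *p)
{
	p->slab_ready = true;
}

/*
 * Models the two branches of the allocation path: serve from the early
 * reservation if it fits, otherwise create a new chunk, which only works
 * once slab is up. Simplification: space inside new chunks is not tracked.
 */
static enum toy_result toy_alloc_percpu(struct toy_percpu *p, size_t size)
{
	assert(size > 0);

	if (size <= p->early_free) {
		p->early_free -= size;
		return TOY_FROM_EARLY;
	}
	if (!p->slab_ready)
		return TOY_FAILED; /* no slab yet, so no new chunk */

	p->extra_chunks++;
	return TOY_FROM_NEW_CHUNK;
}
```

In this model, a request that does not fit the reservation fails until
toy_kmem_cache_init() has been called, which is the situation a smaller
early reservation would make more likely for early alloc_percpu() callers.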