Re: SLUB: percpu partial object count is highly inaccurate, causing some memory wastage and maybe also worse tail latencies?

From: Michal Hocko
Date: Mon Jan 18 2021 - 11:09:57 EST

Next message: Russell King - ARM Linux admin: "Re: [PATCH] ARM: kernel: Fix interrupted SMC calls"
Previous message: Daniel Lezcano: "Re: [PATCH] thermal: power allocator: Add control for non-power actor devices"
In reply to: Christoph Lameter: "Re: SLUB: percpu partial object count is highly inaccurate, causing some memory wastage and maybe also worse tail latencies?"
Next in thread: Vlastimil Babka: "Re: SLUB: percpu partial object count is highly inaccurate, causing some memory wastage and maybe also worse tail latencies?"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Mon 18-01-21 15:46:43, Christoph Lameter wrote:
> On Mon, 18 Jan 2021, Michal Hocko wrote:
>
> > > Hm, this would be similar to recommending a periodic echo > drop_caches
> > > operation. We actually discourage that (and yeah, some tools do it, and
> > > we now report those in dmesg). I believe the kernel should respond to memory
> > > pressure and not OOM prematurely by itself, including SLUB.
> >
> > Absolutely agreed! Partial caches are a very deep internal
> > implementation detail of the allocator and the admin has no business
> > fiddling with that. This would only lead to more harm than good.
> > The comparison to drop_caches is really exact!
>
> Really? The maximum allocation here has an upper bound that depends on
> the number of possible partial per cpu slabs.

And number of cpus and caches...
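
To put rough numbers on how that bound scales, here is a minimal sketch.
It assumes a simplified model in which every cache fills its per-cpu
partial list on every cpu. The function name and all input values are
hypothetical and for illustration only; they are not measurements from
any real system.

```python
def worst_case_percpu_partial_bytes(num_caches, num_cpus,
                                    partial_slabs_per_cpu, slab_bytes):
    """Upper bound on memory parked in per-cpu partial slabs.

    Simplified, assumed model: every cache holds the maximum number of
    partial slabs on every cpu at the same time.
    """
    return num_caches * num_cpus * partial_slabs_per_cpu * slab_bytes


# Illustrative values only (assumptions, not taken from the thread).
bound = worst_case_percpu_partial_bytes(
    num_caches=100, num_cpus=64, partial_slabs_per_cpu=30, slab_bytes=8192)
print(f"worst case: {bound / 2**20:.0f} MiB")
```

The product grows linearly in each factor, so a machine with many cpus
and many active caches can hold a lot of memory this way even if the
per-cpu limit looks small.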

> There is a worst case
> scenario that is not nice and wastes some memory but it is not an OOM
> situation and the system easily recovers from it.

There is no proactive shrinking of those when we are close to OOM,
so we can still go and kill a task while quite some memory is
sitting in freeable SLUB caches, unless I am missing something.

We have learned about this in a memcg environment on our distribution
kernels, where the problem is amplified by the use in memcgs with a small
limit. This is an older kernel and I would expect the current upstream
to behave better with Roman's accounting rework. But it would still be
great if the allocator could manage its caches depending on the memory
demand.

> The slab shrinking is not needed but if you are concerned about reclaiming
> more memory right now then I guess you may want to run the slab shrink
> operation.

Yes, you can do that. In the same way you can shrink the page cache.
Moreover, it is really hard to do that intelligently because you
would need to watch the system very closely in order to shrink when it
is really needed. That requires a deep understanding of the allocator.
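
For reference, the manual shrink operation under discussion is exposed
per cache through the sysfs file /sys/kernel/slab/<cache>/shrink. A
minimal sketch of triggering it for all caches follows. The helper name
and the `root` parameter are hypothetical (the parameter only exists so
the function can be pointed at a different directory), and the sketch
says nothing about when shrinking is actually worthwhile, which is the
hard part described above.

```python
import os


def shrink_slab_caches(root="/sys/kernel/slab"):
    """Write "1" to <root>/<cache>/shrink for each cache directory.

    Returns the sorted list of cache names that were written to.
    Hypothetical helper; it needs root privileges on a real system.
    """
    shrunk = []
    for name in sorted(os.listdir(root)):
        path = os.path.join(root, name, "shrink")
        if not os.path.isfile(path):
            continue  # no shrink attribute for this entry
        try:
            with open(path, "w") as f:
                f.write("1")
        except OSError:
            continue  # insufficient privileges or the cache went away
        shrunk.append(name)
    return shrunk
```

Calling shrink_slab_caches() as root walks every cache once. It does
not look at memory pressure at all, so it is the same kind of blunt
instrument as writing to drop_caches.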

> Dropping the page cache is bad? Well sometimes you want more free memory
> due to a certain operation that needs to be started and where you do not
> want the overhead of page cache processing.

It is not bad if used properly. My experience is that people have
developed an instinct to drop caches whenever something is not quite
right because the Internet has recommended it.

--
Michal Hocko
SUSE Labs
