Re: [RFC] mm/vmscan: add periodic slab shrinker

From: Dave Chinner
Date: Wed Apr 06 2022 - 01:56:51 EST

On Tue, Apr 05, 2022 at 02:31:02PM -0700, Roman Gushchin wrote:
> On Tue, Apr 05, 2022 at 01:58:59PM -0700, Yang Shi wrote:
> > On Tue, Apr 5, 2022 at 9:36 AM Roman Gushchin <roman.gushchin@xxxxxxxxx> wrote:
> > > On Tue, Apr 05, 2022 at 03:17:10PM +1000, Dave Chinner wrote:
> > > > On Mon, Apr 04, 2022 at 12:08:25PM -0700, Roman Gushchin wrote:
> > > > > On Mon, Apr 04, 2022 at 11:09:48AM +1000, Dave Chinner wrote:
> > IMHO
> > the number of really freed pages should be returned (I do understand
> > it is not that easy for now), and returning 0 should be fine.
> It's doable, there is already a mechanism in place which hooks into
> the slub/slab/slob release path and stops the slab reclaim as a whole
> if enough memory was freed.

The reclaim state that accounts for slab pages freed really
needs to be first class shrinker state that is aggregated at the
do_shrink_slab() level and passed back to the vmscan code. The
shrinker infrastructure itself should be aware of the progress each
shrinker is making - not just objects reclaimed but also pages
reclaimed - so it can make better decisions about how much work
should be done by each shrinker.
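
A rough userspace sketch of that idea follows. It is only an illustration
under stated assumptions: struct shrink_result, struct toy_shrinker,
toy_do_shrink() and toy_shrink_slab() are made-up names, not existing kernel
API, and the two scan callbacks are stand-ins for a fragmented and a
well-packed cache (32 objects per page is an arbitrary figure).

```c
#include <stddef.h>

/* Hypothetical per-call result: objects AND pages reclaimed. */
struct shrink_result {
	unsigned long objects_freed;
	unsigned long pages_freed;
};

/* Hypothetical scan callback type, not the kernel's scan_objects(). */
typedef struct shrink_result (*toy_scan_fn)(unsigned long nr_to_scan);

/* Per-shrinker progress kept as first-class shrinker state. */
struct toy_shrinker {
	toy_scan_fn scan;
	unsigned long total_objects_freed;
	unsigned long total_pages_freed;
};

/* Stand-in for a fragmented cache: objects go away, no page does. */
static struct shrink_result scan_fragmented(unsigned long nr_to_scan)
{
	struct shrink_result r = { nr_to_scan, 0 };
	return r;
}

/* Stand-in for a well-packed cache: assumes 32 objects per page. */
static struct shrink_result scan_compact(unsigned long nr_to_scan)
{
	struct shrink_result r = { nr_to_scan, nr_to_scan / 32 };
	return r;
}

/* Run one shrinker, record its progress, report the pages it freed. */
static unsigned long toy_do_shrink(struct toy_shrinker *s,
				   unsigned long nr_to_scan)
{
	struct shrink_result r = s->scan(nr_to_scan);

	s->total_objects_freed += r.objects_freed;
	s->total_pages_freed += r.pages_freed;
	return r.pages_freed;
}

/* Aggregate pages freed across all shrinkers for the reclaim caller. */
static unsigned long toy_shrink_slab(struct toy_shrinker *shrinkers,
				     size_t nr, unsigned long nr_to_scan)
{
	unsigned long pages = 0;
	size_t i;

	for (i = 0; i < nr; i++)
		pages += toy_do_shrink(&shrinkers[i], nr_to_scan);
	return pages;
}
```

With both toy shrinkers asked to scan 128 objects, the caller sees 4 pages
freed in total, and the per-shrinker counters show that all of them came
from the compact cache.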

e.g. lots of objects in cache, lots of objects reclaimed, no pages
reclaimed is indicative of a fragmented slab cache. If this keeps
happening, we should be trying to apply extra pressure to this
specific cache because the only method we have for correcting a
fragmented cache to return some memory is to reclaim lots more
objects from it.
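
One way such a heuristic could look, as a minimal sketch only: the struct,
the function names, the threshold of 2 passes and the cap of 4 doublings are
all assumptions picked for illustration, not anything that exists in
mm/vmscan.c.

```c
/* Hypothetical per-cache state for detecting a fragmented slab cache. */
struct frag_state {
	unsigned int fruitless_passes;	/* consecutive passes: objects freed, no pages */
	unsigned int scan_shift;	/* extra pressure: nr_to_scan << scan_shift */
};

#define FRAG_PASS_THRESHOLD	2	/* assumed: passes before adding pressure */
#define FRAG_MAX_SHIFT		4	/* assumed: cap at 16x the base scan count */

/* Update the state after one shrinker pass. */
static void frag_update(struct frag_state *st, unsigned long objects_freed,
			unsigned long pages_freed)
{
	if (objects_freed > 0 && pages_freed == 0) {
		/* Objects went away but no page came back: looks fragmented. */
		st->fruitless_passes++;
		if (st->fruitless_passes >= FRAG_PASS_THRESHOLD &&
		    st->scan_shift < FRAG_MAX_SHIFT)
			st->scan_shift++;
	} else {
		/* Pages were freed, or nothing was reclaimed: drop the boost. */
		st->fruitless_passes = 0;
		st->scan_shift = 0;
	}
}

/* Scale the scan count for the next pass by the accumulated pressure. */
static unsigned long frag_scale_scan(const struct frag_state *st,
				     unsigned long nr_to_scan)
{
	return nr_to_scan << st->scan_shift;
}
```

Doubling the scan count per fruitless pass is just one possible policy; the
point is only that the decision needs both the object and the page counts
from each shrinker.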

> > The
> > current logic (returning the number of objects) may feed back something
> > over-optimistic. I, at least, experienced once or twice that a
> > significant amount of slab caches were shrunk, but 0 pages
> > were actually freed. TBH the new slab controller may make it worse
> > since the page may be pinned by the objects from other memcgs.
> Of course, the more dense the placement of objects is, the harder it is to get
> the physical pages back. But usually it pays off by having a dramatically
> lower total number of slab pages.

Unless you have tens of millions of objects in the cache. The dentry
cache is a prime example of this "lots of tiny cached objects" case,
where we have tens of objects per slab page and so can suffer badly
from internal fragmentation.
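
To put rough numbers on that, here is a small sketch of the bounds
involved. The helper names are hypothetical, and the figures in the usage
note (21 objects per page, 10 million cached objects) are assumptions
chosen only to match the "tens of objects per slab page" scale.

```c
/* Fewest pages that can hold 'live' objects if they are perfectly packed. */
static unsigned long min_pages_needed(unsigned long live,
				      unsigned long objs_per_page)
{
	return (live + objs_per_page - 1) / objs_per_page;
}

/*
 * Most pages that 'live' objects can pin: one survivor per page keeps the
 * whole page allocated, bounded by the number of pages in the cache.
 */
static unsigned long max_pages_pinned(unsigned long live,
				      unsigned long total_pages)
{
	return live < total_pages ? live : total_pages;
}
```

Assume 10 million objects at 21 per page, which is 476191 slab pages. If
reclaim frees 9 million of them, the remaining 1 million could fit in 47620
pages, but in the worst case they still pin all 476191 pages, i.e. 90% of
the objects reclaimed and zero pages returned.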


Dave Chinner