Re: [PATCH] mm: slowly shrink slabs with a relatively small number of objects

From: Michal Hocko
Date: Tue Sep 04 2018 - 14:07:12 EST


[now CC Vladimir for real]

On Tue 04-09-18 20:06:31, Michal Hocko wrote:
> On Tue 04-09-18 10:52:46, Roman Gushchin wrote:
> > On Tue, Sep 04, 2018 at 06:14:31PM +0200, Michal Hocko wrote:
> [...]
> > > I am not opposing your patch but I am trying to figure out whether that
> > > is the best approach.
> >
> > I don't think the current logic makes sense. Why should cgroups
> > with less than 4k kernel objects be excluded from being scanned?
>
> How is it any different from the LRU reclaim? Maybe it is just not
> that visible because there are usually more pages there. But in principle it
> is the same issue AFAICS.
>
> > Reparenting of all pages is definitely an option to consider,
> > but it's not free in any case, so if there is no problem,
> > why should we? Let's keep it as a last measure. In my case,
> > the proposed patch works perfectly: the number of dying cgroups
> > jumps around 100, where it grew steadily to 2k and more before.
>
> Let me emphasise that I am not opposing the patch. I just think that we
> have made some decisions which are not ideal but I would really like to
> avoid building workarounds on top. If we have to reconsider some
> of those decisions then let's do it. Maybe the priority scaling is just
> too coarse and what seems to work for normal LRUs doesn't work for
> shrinkers.
>
> > I believe that reparenting of LRU lists is required to minimize
> > the number of LRU lists to scan, but I'm not sure.
>
> Well, we do have more lists to scan for normal LRUs. It is true that
> shrinkers add a multiplying factor to that but in principle I guess we
> really want to distinguish dead memcgs because we do want to reclaim
> those much more than the rest. Those objects are basically of no use
> just eating resources. The pagecache has some chance to be reused at
> least but I fail to see why we should keep kernel objects around. Sure,
> some of them might be harder to reclaim due to different lifetime and
> internal object management but this doesn't change the fact that we
> should try hard to reclaim those. So my gut feeling tells me that we
> should have a way to distinguish them.
>
> Btw. I do not see Vladimir on the CC list. Added (the thread starts
> here http://lkml.kernel.org/r/20180831203450.2536-1-guro@xxxxxx)

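For anybody joining the thread late, the "less than 4k objects" cutoff
discussed above can be illustrated with a rough userspace sketch. It is
not the kernel code. It assumes the scan target is derived as
(freeable >> priority) * 4 / seeks, with a default priority of 12 and
default seeks of 2; the helper name and the SKETCH_* constants are
made up for this illustration.

```c
/* Assumed values, meant to mirror the kernel defaults; illustration only. */
#define SKETCH_DEF_PRIORITY  12
#define SKETCH_DEFAULT_SEEKS 2

/*
 * Hypothetical helper: how many objects one shrinker would be asked to
 * scan for a cache holding 'freeable' objects at reclaim 'priority'.
 * The right shift is the coarse step: anything below 2^priority
 * objects rounds down to zero.
 */
static unsigned long long sketch_scan_target(unsigned long long freeable,
					     int priority, int seeks)
{
	unsigned long long delta = freeable >> priority;

	delta *= 4;
	return delta / seeks;
}
```

Under those assumptions sketch_scan_target(4095, SKETCH_DEF_PRIORITY,
SKETCH_DEFAULT_SEEKS) is 0 while the same call with 4096 objects is 2,
so a small cache is left alone until the priority drops.
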
--
Michal Hocko
SUSE Labs