Re: [v8 0/4] cgroup-aware OOM killer

From: Michal Hocko
Date: Mon Sep 18 2017 - 02:14:19 EST


On Fri 15-09-17 08:23:01, Roman Gushchin wrote:
> On Fri, Sep 15, 2017 at 12:58:26PM +0200, Michal Hocko wrote:
> > On Thu 14-09-17 09:05:48, Roman Gushchin wrote:
> > > On Thu, Sep 14, 2017 at 03:40:14PM +0200, Michal Hocko wrote:
> > > > On Wed 13-09-17 14:56:07, Roman Gushchin wrote:
> > > > > On Wed, Sep 13, 2017 at 02:29:14PM +0200, Michal Hocko wrote:
> > > > [...]
> > > > > > I strongly believe that comparing only leaf memcgs
> > > > > > is more straightforward and it doesn't lead to unexpected results as
> > > > > > mentioned before (kill a small memcg which is a part of the larger
> > > > > > sub-hierarchy).
> > > > >
> > > > > One of two main goals of this patchset is to introduce cgroup-level
> > > > > fairness: bigger cgroups should be affected more than smaller,
> > > > > despite the size of tasks inside. I believe the same principle
> > > > > should be used for cgroups.
> > > >
> > > > Yes bigger cgroups should be preferred but I fail to see why bigger
> > > > hierarchies should be considered as well if they are not kill-all. And
> > > > whether non-leaf memcgs should allow kill-all is not entirely clear to
> > > > me. What would be the usecase?
> > >
> > > We definitely want to support kill-all for non-leaf cgroups.
> > > A workload can consist of several cgroups and we want to clean up
> > > the whole thing on OOM.
> >
> > Could you be more specific about such a workload? E.g. how can such a
> > hierarchy be handled consistently when its sub-tree gets killed due to
> > internal memory pressure?
>
> Or just system-wide OOM.
>
> > Or do you expect that none of the subtree will
> > have hard limit configured?
>
> And this can also be a case: the whole workload may have hard limit
> configured, while internal memcgs have only memory.low set for "soft"
> prioritization.
>
> >
> > But then you just enforce a structural restriction on your configuration
> > because
> >        root
> >        /  \
> >       A    D
> >      / \
> >     B   C
> >
> > is a different thing than
> >        root
> >       / |  \
> >      B  C   D
> >
>
> I actually don't have a strong argument against the approach of selecting
> the largest leaf or kill-all-set memcg. I think in practice there will not
> be much difference.

Well, I am worried that the difference will come as a surprise when a
deeper hierarchy is needed for structural reasons.
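
To make the concern concrete, here is a toy model of the two layouts quoted
above. It is only a sketch and not kernel code: the Memcg class, the two
selection functions, and the usage numbers (B=3, C=2, D=4) are all
hypothetical and chosen for illustration. It compares picking the largest
leaf memcg with descending the tree level by level and always following the
child with the largest total usage.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Memcg:
    """Toy memory cgroup: a name, its own usage, and child cgroups."""
    name: str
    own_usage: int = 0
    children: List["Memcg"] = field(default_factory=list)

    def usage(self) -> int:
        # Hierarchical usage: own charge plus everything below.
        return self.own_usage + sum(c.usage() for c in self.children)


def leaves(cg: Memcg) -> List[Memcg]:
    """Collect all leaf cgroups in the subtree rooted at cg."""
    if not cg.children:
        return [cg]
    out: List[Memcg] = []
    for child in cg.children:
        out.extend(leaves(child))
    return out


def pick_largest_leaf(root: Memcg) -> Memcg:
    """Policy 1: compare leaf memcgs only, ignoring intermediate levels."""
    return max(leaves(root), key=lambda cg: cg.usage())


def pick_by_descent(root: Memcg) -> Memcg:
    """Policy 2: at each level follow the child with the largest subtree."""
    cg = root
    while cg.children:
        cg = max(cg.children, key=lambda c: c.usage())
    return cg


def nested() -> Memcg:
    # root -> (A -> (B, C), D)
    a = Memcg("A", children=[Memcg("B", 3), Memcg("C", 2)])
    return Memcg("root", children=[a, Memcg("D", 4)])


def flat() -> Memcg:
    # root -> (B, C, D)
    return Memcg("root", children=[Memcg("B", 3), Memcg("C", 2), Memcg("D", 4)])


if __name__ == "__main__":
    for label, build in (("nested", nested), ("flat", flat)):
        tree = build()
        print(label,
              "largest leaf:", pick_largest_leaf(tree).name,
              "descent:", pick_by_descent(tree).name)
```

With these made-up numbers the leaf-only policy selects D in both layouts,
while the level-by-level descent selects B in the nested layout (because
A = B + C outweighs D) and D in the flat one. The victim therefore depends on
how the same three memcgs happen to be grouped.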

--
Michal Hocko
SUSE Labs