Re: [v8 3/4] mm, oom: add cgroup v2 mount option for cgroup-aware OOM killer

From: Roman Gushchin
Date: Tue Sep 12 2017 - 16:01:50 EST


On Mon, Sep 11, 2017 at 01:48:39PM -0700, David Rientjes wrote:
> On Mon, 11 Sep 2017, Roman Gushchin wrote:
>
> > Add a "groupoom" cgroup v2 mount option to enable the cgroup-aware
> > OOM killer. If not set, the OOM selection is performed in
> > a "traditional" per-process way.
> >
> > The behavior can be changed dynamically by remounting the cgroupfs.
>
> I can't imagine that Tejun would be happy with a new mount option,
> especially when it's not required.
>
> OOM behavior does not need to be defined at mount time and for the entire
> hierarchy. It's possible to very easily implement a tunable as part of
> mem cgroup that is propagated to descendants and controls the oom scoring
> behavior for that hierarchy. It does not need to be system wide and
> affect scoring of all processes based on which mem cgroup they are
> attached to at any given time.

No, I don't think that mixing per-cgroup and per-process OOM selection
algorithms is a good idea.

So, there are 3 reasonable options:
1) boot option
2) sysctl
3) cgroup mount option

I believe 3) is better, because it allows changing the behavior dynamically
and explicitly depends on v2 (which a sysctl lacks).

So the only question is whether it should be an opt-in or an opt-out option.
Personally, I would prefer opt-out, but Michal has a very strong opinion here.

Thanks!