Re: [v7 5/5] mm, oom: cgroup v2 mount option to disable cgroup-aware OOM killer

From: Roman Gushchin
Date: Mon Sep 11 2017 - 08:50:56 EST

On Mon, Sep 11, 2017 at 11:05:59AM +0200, Michal Hocko wrote:
> On Thu 07-09-17 12:14:57, Johannes Weiner wrote:
> > On Wed, Sep 06, 2017 at 10:28:59AM +0200, Michal Hocko wrote:
> > > On Tue 05-09-17 17:53:44, Johannes Weiner wrote:
> > > > The cgroup-awareness in the OOM killer is exactly the same thing. It
> > > > should have been the default from the beginning, because the user
> > > > configures a group of tasks to be an interdependent, terminal unit of
> > > > memory consumption, and it's undesirable for the OOM killer to ignore
> > > > this intention and compare members across these boundaries.
> > >
> > > I would agree if that was true in general. I can completely see how the
> > > cgroup awareness is useful in e.g. containerized environments (especially
> > > with kill-all enabled) but memcgs are used in a large variety of
> > > usecases and I cannot really say all of them really demand the new
> > > semantic. Say I have a workload which doesn't want to see reclaim
> > > interference from others on the same machine. Why should I kill a
> > > process from that particular memcg just because it is the largest one
> > > when there is a memory hog/leak outside of this memcg?
> >
> > Sure, it's always possible to come up with a config for which this
> > isn't the optimal behavior. But this is about picking a default that
> > makes sense to most users, and that type of cgroup usage just isn't
> > the common case.
>
> How can you tell, really? Even if cgroup2 is a new interface we still
> want as many legacy (v1) users to be migrated to the new hierarchy.
> I have seen quite different usecases over time and I have hard time to
> tell which of them to call common enough.
>
> > > From my point of view the safest (in a sense of the least surprise)
> > > way to go with opt-in for the new heuristic. I am pretty sure all who
> > > would benefit from the new behavior will enable it while others will not
> > > regress in unexpected way.
> >
> > This thinking simply needs to be balanced against the need to make an
> > unsurprising and consistent final interface.
>
> Sure. And I _think_ we can come up with a clear interface to configure
> the oom behavior - e.g. a kernel command line parameter with a default
> based on a config option.

I would say a cgroup v2 mount option is better, because it allows the
behavior to be changed dynamically (without rebooting) and clearly
reflects the cgroup v2 dependency.

Also, it makes systemd (or whoever mounts cgroupfs) responsible for the
default behavior, which makes it more or less unimportant what the
default is.