Re: [PATCH 2/2] mm: Consider subtrees in memory.events

From: Johannes Weiner
Date: Wed Jan 30 2019 - 14:23:50 EST


On Mon, Jan 28, 2019 at 01:51:51PM +0100, Michal Hocko wrote:
> On Fri 25-01-19 10:28:08, Tejun Heo wrote:
> > On Fri, Jan 25, 2019 at 06:37:13PM +0100, Michal Hocko wrote:
> > > Please note that I understand that this might be confusing with the rest
> > > of the cgroup APIs but considering that this is the first time somebody
> > > is actually complaining and the interface is "production ready" for more
> > > than three years I am not really sure the situation is all that bad.
> >
> > cgroup2 uptake hasn't progressed that fast. None of the major distros
> > or container frameworks are currently shipping with it although many
> > are evaluating switching. I don't think I'm too mistaken in that we
> > (FB) are at the bleeding edge in terms of adopting cgroup2 and its
> > various new features and are hitting these corner cases and oversights
> > in the process. If there are noticeable breakages arising from this
> > change, we sure can backpedal but I think the better course of action
> > is fixing them up while we can.
>
> I do not really think you can go back. You cannot simply change semantic
> back and forth because you just break new users.
>
> Really, I do not see the semantic changing after more than 3 years of
> production ready interface. If you really believe we need a hierarchical
> notification mechanism for the reclaim activity then add a new one.

This discussion needs to be more nuanced.

We change interfaces and user-visible behavior all the time when we
think nobody is likely to rely on it. Sometimes we change them after
decades of established behavior - for example the recent OOM killer
change to not kill children over parents.

The argument was made that it's very unlikely that we break any
existing user setups relying specifically on the behavior we are
trying to fix. I don't see that argument actually being disputed,
other than by repeating "we can't change it after three years".

I also don't see a concrete description of a plausible scenario that
this change might break.

I would like to see a solid case for why this change is a notable risk
to actual users (interface age is not a criterion for other changes)
before discussing errata solutions.