Re: [PATCH 1/1] mm, oom_adj: don't loop through tasks in __set_oom_adj when not necessary
From: Christian Brauner
Date: Thu Aug 20 2020 - 11:07:48 EST
On Thu, Aug 20, 2020 at 09:49:11AM -0500, Eric W. Biederman wrote:
> Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx> writes:
>
> > On 2020/08/20 23:00, Christian Brauner wrote:
> >> On Thu, Aug 20, 2020 at 10:48:43PM +0900, Tetsuo Handa wrote:
> >>> On 2020/08/20 22:34, Christian Brauner wrote:
> >>>> On Thu, Aug 20, 2020 at 03:26:31PM +0200, Michal Hocko wrote:
> >>>>> If you can handle vfork by other means then I am all for it. There were
> >>>>> no patches in that regard proposed yet. Maybe it will turn out simpler
> >>>>> then the heavy lifting we have to do in the oom specific code.
> >>>>
> >>>> Eric's not wrong. I fiddled with this too this morning but since
> >>>> oom_score_adj is fiddled with in a bunch of places this seemed way more
> >>>> code churn then what's proposed here.
> >>>
> >>> I prefer simply reverting commit 44a70adec910d692 ("mm, oom_adj: make sure
> >>> processes sharing mm have same view of oom_score_adj").
> >>>
> >>> https://lore.kernel.org/patchwork/patch/1037208/
> >>
> >> I guess this is a can of worms but just or the sake of getting more
> >> background: the question seems to be whether the oom adj score is a
> >> property of the task/thread-group or a property of the mm. I always
> >> thought the oom score is a property of the task/thread-group and not the
> >> mm which is also why it lives in struct signal_struct and not in struct
> >> mm_struct. But
> >>
> >> 44a70adec910 ("mm, oom_adj: make sure processes sharing mm have same view of oom_score_adj")
> >>
> >> reads like it is supposed to be a property of the mm or at least the
> >> change makes it so.
> >
> > Yes, 44a70adec910 is trying to go towards changing from a property of the task/thread-group
> > to a property of mm. But I don't think we need to do it at the cost of "__set_oom_adj() latency
> > Yong-Taek Lee and Tim Murray have reported" and "complicity for supporting
> > vfork() => __set_oom_adj() => execve() sequence".
>
> The thing is commit 44a70adec910d692 ("mm, oom_adj: make sure processes
> sharing mm have same view of oom_score_adj") has been in the tree for 4
> years.
>
> That someone is just now noticing a regression is their problem. The
> change is semantics is done and decided. We can not reasonably revert
> at this point without risking other regressions.
>
> Given that the decision has already been made to make oom_adj
> effectively per mm. There is no point on have a debate if we should do
> it.
I mean yeah, I think no-one really was going to jump on the revert-train.
At least for me the historical background was quite important to know
actually. The fact that by sharing the mm one can effectively be bound
to the oom score of another thread-group is kinda suprising and the
oom_score_adj section in man proc doesn't mention this.
And I really think we need to document this. Either on the clone() or on
the oom_score_adj page...
Christian