Re: [PATCH] oom_kill: oom_score_adj broken for processes with small memory usage

From: Corey Minyard
Date: Fri Jul 16 2021 - 08:25:53 EST


On Fri, Jul 16, 2021 at 07:19:24AM +0200, Michal Hocko wrote:
> On Thu 01-07-21 07:54:30, minyard@xxxxxxx wrote:
> > From: Corey Minyard <cminyard@xxxxxxxxxx>
> >
> > If you have a process with less than 1000 totalpages, the calculation:
> >
> > adj = (long)p->signal->oom_score_adj;
> > ...
> > adj *= totalpages / 1000;
> >
> > will always result in adj being zero no matter what oom_score_adj is,
> > which could result in the wrong process being picked for killing.
> >
> > Fix by adding 1000 to totalpages before dividing.
>
> Yes, this is a known limitation of the oom_score_adj and its scale.
> Is this a practical problem to be solved though? I mean 0-1000 pages is
> not really that much different from imprecision at a larger scale where
> tasks are effectively considered equal.

Known limitation? Is this documented? I couldn't find anything that
said "oom_score_adj doesn't work at all with programs with <1000 pages
besides setting the value to -1000".

>
> I have to say I do not really like the proposed workaround. It doesn't
> really solve the problem yet it adds another special case.

The problem is that if you have a small program, there is no way to
set its priority besides completely disabling the OOM killer for
it.
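
As a rough illustration of the truncation described in the patch
description, here is a minimal userspace sketch. It is not the kernel
source: the two function names are hypothetical, and only the
arithmetic quoted above (adj *= totalpages / 1000, and the proposed
"add 1000 before dividing") is taken from this thread.

```c
#include <stdio.h>

/*
 * Hypothetical helper mirroring the scaling quoted in the patch
 * description: adj *= totalpages / 1000.  The integer division
 * truncates, so any totalpages below 1000 yields a multiplier of 0.
 */
static long scale_adj_current(long oom_score_adj, unsigned long totalpages)
{
	long adj = oom_score_adj;

	adj *= (long)(totalpages / 1000);
	return adj;
}

/*
 * Hypothetical helper for the proposed fix: add 1000 to totalpages
 * before dividing, so the multiplier is always at least 1.
 */
static long scale_adj_patched(long oom_score_adj, unsigned long totalpages)
{
	long adj = oom_score_adj;

	adj *= (long)((totalpages + 1000) / 1000);
	return adj;
}
```

With totalpages = 999, scale_adj_current() returns 0 for every
oom_score_adj value, while scale_adj_patched() returns the
oom_score_adj value itself, so positive and negative adjustments
still have an effect.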

I don't understand the special case comment. How is this adding a
special case? This patch removes a special case. Small programs
working differently from big programs is a special case. Making them
all work the same removes an element of surprise for someone expecting
things to work as documented.

-corey