Re: [PATCH 1/1] mm, oom_adj: don't loop through tasks in __set_oom_adj when not necessary

From: Michal Hocko
Date: Thu Aug 20 2020 - 10:34:40 EST


On Thu 20-08-20 23:26:29, Tetsuo Handa wrote:
> On 2020/08/20 23:15, Michal Hocko wrote:
> > I would tend to agree that from the userspace POV it is nice to look at
> > oom tuning per process but fundamentally the oom killer operates on the
> > address space much more than other resources bound to a process because
> > it is usually the address space hogging the largest portion of the
> > memory footprint. This is the reason why the oom killer has been
> > evaluating tasks based on that aspect rather than other potential memory
> > consumers bound to a task. Mostly due to lack of means to evaluate
> > those.
>
> We already allow specifying potential memory consumers via oom_task_origin().

oom_task_origin is a single-purpose hack to handle the swapoff situation
more gracefully. By no means is this something to base the behavior on.

> If we change from a property of the task/thread-group to a property of mm,
> we won't be able to add means to adjust oom score based on other potential
> memory consumers bound to a task (e.g. pipes) in the future.

While that would be really nice to achieve, I am not really sure it is
feasible, mostly because accounting for shared resources like pipes, and
fd-based resources in general, is really hard to do right without any
surprises. Pipes, for example, are not really bound to a specific process:
you are free to hand the fd over to a different process.
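
To illustrate the fd handover, here is a minimal userspace sketch in
Python (3.9+, Unix only). It is an illustration, not kernel code, and the
function name hand_over_pipe_fd is hypothetical. For brevity both ends of
the Unix socket live in one process; the same SCM_RIGHTS mechanism works
between unrelated processes. The point is that the buffered data stays
readable after the original holder has closed its fd, so it is unclear
which task the pipe memory should be charged to.

```python
import os
import socket


def hand_over_pipe_fd(payload: bytes = b"buffered in the pipe") -> bytes:
    """Pass a pipe's read end over a Unix socket and read via the new fd.

    Hypothetical helper for illustration only. The payload must be small
    enough to fit in the pipe buffer so the write does not block.
    """
    read_fd, write_fd = os.pipe()
    os.write(write_fd, payload)  # data now sits in the kernel pipe buffer
    os.close(write_fd)

    sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        # Send the read end as ancillary data (SCM_RIGHTS).
        socket.send_fds(sender, [b"x"], [read_fd])
        # The original holder drops its reference; the pipe stays alive
        # because the in-flight/received fd still refers to it.
        os.close(read_fd)

        _msg, fds, _flags, _addr = socket.recv_fds(receiver, 1, 1)
        received_fd = fds[0]
        try:
            return os.read(received_fd, len(payload))
        finally:
            os.close(received_fd)
    finally:
        sender.close()
        receiver.close()
```

Calling hand_over_pipe_fd(b"hello") returns the bytes that were written
before the handover, read through the received descriptor.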
--
Michal Hocko
SUSE Labs