Re: Re: [PATCH v2] mm: oom: introduce cpuset oom

From: Michal Hocko
Date: Thu Apr 06 2023 - 04:30:42 EST


On Thu 06-04-23 11:22:16, Gang Li wrote:
>
> On 2023/4/4 22:31, Michal Hocko wrote:
> > [CC cpuset people]
> >
> > The oom report should be explicit about this being CPUSET specific oom
> > handling so unexpected behavior could be nailed down to this change so I
> Yes, the oom message looks like this:
>
> ```
> [ 65.470256] oom-kill:constraint=CONSTRAINT_CPUSET,nodemask=(null),cpuset=test,mems_allowed=0,global_oom,task_memcg=/user.slice/user-0.slice/session-4.scope,task=memkiller,pid=1968,uid=0
> Apr 4 09:08:53 debian kernel: [ 65.481992] Out of memory: Killed process
> 1968 (memkiller) total-vm:2099436kB, anon-rss:1971712kB, file-rss:1024kB,
> shmem-rss:0kB, UID:0 pgtables:3904kB oom_score_adj:0
> ```
>
>
> > do not see a major concern from the oom POV. Nevertheless it would be
> > still good to consider whether this should be an opt-in behavior. I
> > personally do not see a major problem because most cpuset deployments I
> > have seen tend to be well partitioned so the new behavior makes more
> > sense.
> >
>
> Since memcgroup oom is mandatory, cpuset oom should preferably be mandatory
> as well. But we can still consider adding an option for users.
>
> How about introducing `/proc/sys/vm/oom_in_cpuset`?

As I've said, I do not see any major concern with having this behavior
implicit. It makes semantic sense, and it is also much more likely that
the selected oom victim will be a better choice than what we do
currently, especially on properly partitioned systems with large memory
consumers in each partition (cpuset).

That being said, I would just not add any sysctl at this stage and
rather document the decision. If we ever encounter use case(s) which
would regress because of this change we can introduce the sysctl later.
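
Should such a report ever come in, the constraint= field of the oom-kill
line quoted above ought to be enough to tell whether the kill went
through the cpuset path. A minimal userspace sketch of that check,
assuming the summary line keeps the "oom-kill:constraint=..." layout
shown in the quoted log; the helper names are made up for illustration
and are not part of the patch:

```python
import re

# Assumption: the summary line starts its payload with "constraint=<NAME>",
# as in the log quoted earlier in this thread.
_CONSTRAINT = re.compile(r"oom-kill:constraint=(\w+)")


def oom_constraint(line):
    """Return the constraint name from an oom-kill summary line, or None."""
    match = _CONSTRAINT.search(line)
    return match.group(1) if match else None


def is_cpuset_oom(line):
    """True if the line reports an oom kill constrained to a cpuset."""
    return oom_constraint(line) == "CONSTRAINT_CPUSET"


# The summary line from the quoted report.
sample = (
    "[ 65.470256] oom-kill:constraint=CONSTRAINT_CPUSET,nodemask=(null),"
    "cpuset=test,mems_allowed=0,global_oom,"
    "task_memcg=/user.slice/user-0.slice/session-4.scope,"
    "task=memkiller,pid=1968,uid=0"
)

if __name__ == "__main__":
    print(oom_constraint(sample), is_cpuset_oom(sample))
```

Only the constraint field is matched here; the remaining fields are
left alone because node lists may themselves contain commas, so a
naive split on "," would not be reliable.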

--
Michal Hocko
SUSE Labs