Re: [PATCH v8] oom_kill.c: futex: Don't OOM reap the VMA containing the robust_list_head

From: Michal Hocko
Date: Fri Apr 08 2022 - 06:51:50 EST


On Fri 08-04-22 06:36:40, Nico Pache wrote:
>
>
> On 4/8/22 05:59, Michal Hocko wrote:
> > On Fri 08-04-22 05:40:09, Nico Pache wrote:
> >>
> >>
> >> On 4/8/22 05:36, Michal Hocko wrote:
> >>> On Fri 08-04-22 04:52:33, Nico Pache wrote:
> >>> [...]
> >>>> In a heavily contended CPU with high memory pressure the delay may also
> >>>> lead to other processes unnecessarily OOMing.
> >>>
> > >>> Let me just comment on this part because there is likely a confusion
> > >>> involved. Delaying the oom_reaper _cannot_ lead to additional OOM killing
> > >>> because the oom killing is throttled by the existence of a preexisting
> > >>> OOM victim. In other words, as long as there is an alive victim, no
> > >>> further victims are selected and the oom killer backs off. The
> > >>> oom_reaper will hide the alive oom victim after it is processed.
> > >>> The longer the delay, the longer an oom victim can block
> > >>> further progress, but it cannot really cause unnecessary OOMing.
> >> Is it not the case that if we delay an OOM, the amount of available memory stays
> >> limited and other processes that are allocating memory can become OOM candidates?
> >
> > No. Have a look at oom_evaluate_task (tsk_is_oom_victim check).
> Ok I see.
>
> Doesn't the delay then allow the system to run into the following case more easily?
> pr_warn("Out of memory and no killable processes...\n");
> panic("System is deadlocked on memory\n");

No. Aborting the oom victim search (mentioned above) will cause
out_of_memory to bail out and return to the page allocator. As I've said,
the only problem with delaying the oom_reaper is that _iff_ the oom
victim cannot terminate on its own (because it is stuck somewhere in the
kernel) then the oom situation (be it global, cpuset or memcg) will
take longer, so allocating tasks will not be able to make forward
progress in the meantime.
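
To illustrate the back-off, here is a minimal userspace sketch of the
selection logic. This is not the kernel code: model_task, model_select
and the MODEL_* results are hypothetical names invented for this
example, and the scoring is reduced to a single number. It only models
the two facts stated in this thread: a live oom victim aborts the
search, and a victim already processed by the oom_reaper is hidden
from it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for per-task state. */
struct model_task {
	bool is_oom_victim;	/* models the tsk_is_oom_victim() check */
	bool reaped;		/* victim already processed (hidden) by the oom_reaper */
	long badness;		/* simplified score; <= 0 means not killable */
};

enum model_result {
	MODEL_NO_KILLABLE,	/* nothing to kill: the pr_warn()/panic() case */
	MODEL_ABORT,		/* live victim found: back off, kill nothing */
	MODEL_KILL,		/* a new victim was selected */
};

/*
 * Walk all tasks. A live (not yet reaped) oom victim aborts the whole
 * search, so no further victim is selected and the "no killable
 * processes" case is not reached either. *chosen is only meaningful
 * when MODEL_KILL is returned.
 */
static enum model_result model_select(const struct model_task *tasks,
				      size_t n, size_t *chosen)
{
	long best = 0;
	bool found = false;

	for (size_t i = 0; i < n; i++) {
		const struct model_task *t = &tasks[i];

		if (t->is_oom_victim) {
			if (t->reaped)
				continue;	/* hidden: skip it, keep looking */
			return MODEL_ABORT;	/* alive victim: back off */
		}
		if (t->badness > best) {
			best = t->badness;
			*chosen = i;
			found = true;
		}
	}
	return found ? MODEL_KILL : MODEL_NO_KILLABLE;
}
```

In this model a delayed oom_reaper simply means "reaped" stays false
for longer, so model_select keeps returning MODEL_ABORT. It never
turns into MODEL_KILL or MODEL_NO_KILLABLE while the victim is alive
and unreaped.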

--
Michal Hocko
SUSE Labs