Re: Machine lockups on extreme memory pressure

From: Michal Hocko
Date: Tue Sep 22 2020 - 12:34:04 EST

On Tue 22-09-20 09:29:48, Shakeel Butt wrote:
> On Tue, Sep 22, 2020 at 8:16 AM Michal Hocko <mhocko@xxxxxxxx> wrote:
> >
> > On Tue 22-09-20 06:37:02, Shakeel Butt wrote:
> > > I talked about this problem with Johannes at LPC 2019 and I think we
> > > talked about two potential solutions. First was to somehow give memory
> > > reserves to oomd and second was in-kernel PSI based oom-killer. I am
> > > not sure the first one will work in this situation but the second one
> > > might help.
> >
> > Why does your oomd depend on memory allocation?
> >
> It does not but I think my concern was the potential allocations
> during syscalls.

So what is the problem then? Why can't your oomd kill anything?

> Anyways, what do you think of the in-kernel PSI based
> oom-kill trigger. I think Johannes had a prototype as well.

We have talked about something like that in the past and established
that auto-tuning the oom killer based on PSI is almost impossible to get
right for all potential workloads, so this belongs in userspace.
The kernel's oom killer is there as a last resort when the system gets
close to meltdown.
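
To make the userspace side concrete, here is a minimal sketch of how a
daemon might read memory PSI and apply its own policy. It assumes the
usual two-line "some"/"full" format of /proc/pressure/memory; the helper
names (parse_psi, over_threshold), the sample values and the 5.0 limit
are hypothetical and only for illustration. Picking that limit is exactly
the workload-specific tuning described above.

```python
def parse_psi(text):
    """Parse PSI text (as found in /proc/pressure/memory) into nested dicts.

    Each line looks like:
        some avg10=0.00 avg60=0.00 avg300=0.00 total=0
    """
    result = {}
    for line in text.strip().splitlines():
        kind, *fields = line.split()
        # Every field is "key=value"; store all values as floats.
        result[kind] = {
            key: float(value)
            for key, value in (field.split("=", 1) for field in fields)
        }
    return result


def over_threshold(psi, full_avg10_limit):
    """Hypothetical policy: act when 'full' avg10 stall reaches the limit."""
    return psi.get("full", {}).get("avg10", 0.0) >= full_avg10_limit


# Illustrative sample, not real measurements.
SAMPLE = (
    "some avg10=12.50 avg60=3.10 avg300=0.80 total=123456\n"
    "full avg10=8.25 avg60=1.90 avg300=0.40 total=65432\n"
)

if __name__ == "__main__":
    psi = parse_psi(SAMPLE)
    # A real daemon would read /proc/pressure/memory here and choose a victim.
    print("act" if over_threshold(psi, 5.0) else "wait")
```

A real daemon would also have to avoid allocating on this path (for
example by locking its memory up front), which is the concern about
allocations during syscalls raised earlier in the thread.
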
-- 
Michal Hocko