Re: Machine lockups on extreme memory pressure

From: Michal Hocko
Date: Tue Sep 22 2020 - 13:01:10 EST


On Tue 22-09-20 09:51:30, Shakeel Butt wrote:
> On Tue, Sep 22, 2020 at 9:34 AM Michal Hocko <mhocko@xxxxxxxx> wrote:
> >
> > On Tue 22-09-20 09:29:48, Shakeel Butt wrote:
[...]
> > > Anyway, what do you think of an in-kernel PSI-based
> > > oom-kill trigger? I think Johannes had a prototype as well.
> >
> > We have talked about something like that in the past and established
> > that auto-tuning the oom killer based on PSI is almost impossible to get
> > right for all potential workloads, and so this belongs in userspace.
> > The kernel's oom killer is there as a last resort when the system gets
> > close to meltdown.
>
> The system is already in a meltdown state from the user's perspective. I
> still think allowing users to optionally set an oom-kill trigger
> based on PSI makes sense. Something like 'if all processes on the
> system are stuck for 60 sec, trigger oom-killer'.

We already have watchdogs for that, no? If you cannot really schedule
anything then the soft lockup detector should fire. In a meltdown state
like that, a reboot is likely the best way forward anyway.
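
For illustration only, a userspace policy along the lines of the quoted
'stuck for 60 sec' rule could be sketched roughly as below. This is a
minimal sketch, not anything proposed in this thread: it assumes the
"full" avg60 value in /proc/pressure/memory is an acceptable proxy for
"all processes are stuck", and the function names and the 90% threshold
are hypothetical. It only reports the decision; what to do when the
rule fires is left to the policy.

```python
# Hypothetical userspace check based on PSI memory pressure.
# Function names and the default threshold are illustrative assumptions.

def parse_psi(text):
    """Parse PSI file contents into {"some": {...}, "full": {...}}."""
    result = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        kind, fields = parts[0], parts[1:]
        # Each field looks like "avg60=12.34" or "total=123456".
        result[kind] = {k: float(v) for k, v in (f.split("=", 1) for f in fields)}
    return result


def should_trigger(psi, threshold=90.0):
    """Return True when the 'full' avg60 stall percentage reaches threshold."""
    full = psi.get("full")
    return full is not None and full["avg60"] >= threshold


def read_memory_pressure(path="/proc/pressure/memory"):
    """Read and parse the system-wide memory pressure file."""
    with open(path) as f:
        return parse_psi(f.read())


if __name__ == "__main__":
    # Requires a kernel built with PSI support and PSI enabled.
    print("trigger" if should_trigger(read_memory_pressure()) else "ok")
```

Note that a monitor like this still has to get scheduled in order to
act, which is exactly what may not happen in the state described above.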
--
Michal Hocko
SUSE Labs