Re: [patch] mm, oom: prevent soft lockup on memcg oom for UP systems

From: Michal Hocko
Date: Mon Mar 16 2020 - 05:31:58 EST


On Thu 12-03-20 15:32:38, Andrew Morton wrote:
> On Thu, 12 Mar 2020 11:07:15 -0700 (PDT) David Rientjes <rientjes@xxxxxxxxxx> wrote:
>
> > On Thu, 12 Mar 2020, Tetsuo Handa wrote:
> >
> > > > On Thu, 12 Mar 2020, Tetsuo Handa wrote:
> > > > > > If you have an alternate patch to try, we can test it. But since this
> > > > > > cond_resched() is needed anyway, I'm not sure it will change the result.
> > > > >
> > > > > schedule_timeout_killable(1) is an alternate patch to try; I don't think
> > > > > that this cond_resched() is needed anyway.
> > > > >
> > > >
> > > > You are suggesting schedule_timeout_killable(1) in shrink_node_memcgs()?
> > > >
> > >
> > > Andrew Morton also mentioned whether cond_resched() in shrink_node_memcgs()
> > > is enough. But like you mentioned,
> > >
> >
> > It passes our testing because this is where the allocator is looping while
> > the victim is trying to exit if only it could be scheduled.
>
> What happens if the allocator has SCHED_FIFO?

The same thing as with a SCHED_FIFO task running in a tight loop in userspace.

As long as a high priority context depends on a resource held by a low
priority task, we have a priority inversion problem, and the page
allocator is no exception here. I do not see how the allocator
is much different from any other code in the kernel. We do not add
random sleeps here and there to push high priority FIFO or RT tasks
out of the execution context. We do cond_resched to help !PREEMPT
kernels, but priority related issues are really out of scope of that
facility.
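
To illustrate the distinction, here is a rough userspace sketch, not kernel
code. cond_resched_stub() and scan_items() are made-up names for this
example; the stub uses sched_yield() as a loose stand-in for cond_resched().
The analogy is imperfect, but it shows the same limitation: a voluntary
scheduling point lets other runnable tasks in, yet a SCHED_FIFO caller would
only give way to tasks of at least its own priority, so a lower priority
resource holder still would not run.

```c
#include <sched.h>

/*
 * Hypothetical stand-in for the kernel's cond_resched(): a voluntary
 * scheduling point. This is an assumption made for illustration; the
 * real cond_resched() only reschedules when a reschedule is pending.
 */
static void cond_resched_stub(void)
{
	sched_yield();
}

/*
 * Long-running loop with a voluntary scheduling point per iteration.
 * Returns the number of items processed.
 */
static unsigned long scan_items(unsigned long nr_items)
{
	unsigned long scanned = 0;
	unsigned long i;

	for (i = 0; i < nr_items; i++) {
		scanned++;		/* placeholder for the real per-item work */
		cond_resched_stub();	/* offers the CPU; does not lower our priority */
	}

	return scanned;
}
```

Nothing in the loop above changes the caller's priority, which is why such
a scheduling point cannot be expected to resolve a priority inversion.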
--
Michal Hocko
SUSE Labs