Re: [PATCH 1/2] mm, oom: introduce oom reaper
From: Tetsuo Handa
Date: Tue Feb 02 2016 - 06:48:16 EST

Michal Hocko wrote:
> > In this case, the oom reaper has ignored the next victim and doesn't do
> > anything; the simple race has prevented it from zapping memory and does
> > not reduce the livelock probability.
> >
> > This can be solved either by queueing mm's to reap or involving the oom
> > reaper into the oom killer synchronization itself.
>
> as we have already discussed previously oom reaper is really tricky to
> be called from the direct OOM context. I will go with queuing.
>
OK. But it is not easy to build a reliable OOM-reap queuing chain. I think
that we will end up needing a dedicated kernel thread which does both the
OOM-kill operation and the OOM-reap operation. That would also handle the
"sleeping for too long with oom_lock held after sending SIGKILL" problem.
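
To illustrate what "queuing mm's to reap" means here, below is a minimal
userspace sketch, not kernel code. All names (struct victim, reap_queue_*,
reap_all) are hypothetical, and a pthread mutex stands in for whatever
locking the kernel side would use. It only shows that a victim queued while
the reaper is busy is not lost; it does not address the reliability problem
(for example, a reaper that cannot make progress on one victim).

```c
/* Userspace sketch only: hypothetical names, not the kernel implementation. */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct victim {			/* stands in for an OOM victim's mm */
	int id;
	int reaped;
	struct victim *next;
};

struct reap_queue {
	pthread_mutex_t lock;
	struct victim *head;
	struct victim *tail;
};

static void reap_queue_init(struct reap_queue *q)
{
	pthread_mutex_init(&q->lock, NULL);
	q->head = NULL;
	q->tail = NULL;
}

/* OOM-kill side: append a victim instead of dropping it when busy. */
static void reap_queue_push(struct reap_queue *q, struct victim *v)
{
	v->next = NULL;
	pthread_mutex_lock(&q->lock);
	if (q->tail)
		q->tail->next = v;
	else
		q->head = v;
	q->tail = v;
	pthread_mutex_unlock(&q->lock);
}

/* Reaper side: take the oldest queued victim, or NULL if none. */
static struct victim *reap_queue_pop(struct reap_queue *q)
{
	struct victim *v;

	pthread_mutex_lock(&q->lock);
	v = q->head;
	if (v) {
		q->head = v->next;
		if (!q->head)
			q->tail = NULL;
	}
	pthread_mutex_unlock(&q->lock);
	return v;
}

/* Reaper side: drain the queue; returns how many victims were reaped. */
static int reap_all(struct reap_queue *q)
{
	struct victim *v;
	int n = 0;

	while ((v = reap_queue_pop(q)) != NULL) {
		assert(!v->reaped);	/* each victim is reaped at most once */
		v->reaped = 1;		/* placeholder for zapping memory */
		n++;
	}
	return n;
}
```

In this sketch the OOM-kill side would call reap_queue_push() after sending
SIGKILL, and a single reaper thread would call reap_all() whenever it is
woken up.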

> > I'm baffled by any reference to "memcg oom heavy loads", I don't
> > understand this paragraph, sorry. If a memcg is oom, we shouldn't be
> > disrupting the global runqueue by running oom_reaper at a high priority.
> > The disruption itself is not only in first wakeup but also in how long the
> > reaper can run and when it is rescheduled: for a lot of memory this is
> > potentially long. The reaper is best-effort, as the changelog indicates,
> > and we shouldn't have a reliance on this high priority: oom kill exiting
> > can't possibly be expected to be immediate. This high priority should be
> > removed so memcg oom conditions are isolated and don't affect other loads.
>
> If this is a concern then I would be tempted to simply disable oom
> reaper for memcg oom altogether. For me it is much more important that
> the reaper, even though a best effort, is guaranteed to schedule if
> something goes terribly wrong on the machine.

I think that if something goes terribly wrong on the machine, a guarantee of
scheduling the reaper will not help unless we build a reliable queuing chain.
Building a reliable queuing chain will break some of the assumptions provided
by the current behavior. For me, a guarantee that the next OOM-kill operation
gets scheduled (with some or all of the memory reserves opened globally) is
much more important, and it should come before building a reliable queuing
chain.

> But ohh well... I will queue up a patch to do this
> on top. I plan to repost the full patchset shortly.

Maybe we all agree on introducing the OOM reaper without queuing, but I do
want to see a guarantee that the next OOM-kill operation gets scheduled before
we try to build a reliable queuing chain.