Re: [PATCH] memcg: oom: ignore oom warnings from memory.max
From: Michal Hocko
Date: Mon May 04 2020 - 03:36:03 EST
On Mon 04-05-20 15:26:52, Yafang Shao wrote:
> On Mon, May 4, 2020 at 3:03 PM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> >
> > On Fri 01-05-20 09:39:24, Yafang Shao wrote:
> > > On Fri, May 1, 2020 at 2:27 AM Shakeel Butt <shakeelb@xxxxxxxxxx> wrote:
> > > >
> > > > Lowering memory.max can trigger an oom-kill if the reclaim does not
> > > > succeed. However if oom-killer does not find a process for killing, it
> > > > dumps a lot of warnings.
> > > >
> > >
> > > I have been confused by this behavior for several months and I think
> > > it will confuse more memcg users.
> >
> > Could you be more specific what has caused the confusion?
> >
>
> No task is different from no eligible task.
> No eligible task means there are some candidates but no one is eligible.
> Whille no task means there is no candidate.
I really fail to see a difference. It is clear the one is subset of the
other but in practical life tasks might come and go at any time and if
you try to reduce the hard limit and there are no tasks that could be
reclaimed then I believe we should complain whether it is only oom
disabled tasks or no tasks at all. It is certainly unexpected situation
in some cases because there are resources which are bound to the memcg
without any task we can act on.
> > > We should keep the memcg oom behavior consistent with system oom - no
> > > oom kill if no process.
> >
> > This is not the global mmemcg behavior. We do complain loud on no
> > eligible tasks and actually panic the system. Memcg cannot simply
> > do the same by default for obvious reasons.
> >
>
> As explianed above, no eligible task is different from no task.
> If there are some candidates but no one is eligible, the system will panic.
> While if there's no task, it is definitely no OOM, because that's an
> improssible thing for the system.
This is very much possible situation when all eligible tasks have been
already killed but they didn't really help to resolve the oom situation
- e.g. in kernel memory leak or unbounded shmem consumption etc...
--
Michal Hocko
SUSE Labs