Re: [RFC] 3.10 kernel- oom with about 24G free memory

From: Michal Hocko
Date: Fri Feb 10 2017 - 03:54:12 EST


On Fri 10-02-17 16:48:58, Yisheng Xie wrote:
> Hi Michal,
>
> Thanks for comment!
> On 2017/2/10 15:09, Michal Hocko wrote:
> > On Fri 10-02-17 09:13:58, Yisheng Xie wrote:
> >> hi Michal,
> >> Thanks for your comment.
> >>
> >> On 2017/2/9 21:41, Michal Hocko wrote:
[...]
> >>>> OK, so this is a memcg OOM killer which panics because the configuration
> >>>> says so. The OOM report doesn't say so and that is the bug. dump_header
> >>>> is memcg aware and mem_cgroup_out_of_memory initializes oom_control
> >>>> properly. Is this Vanilla kernel?
> >>
> >> That means we should raise the limit of that memcg to avoid memcg OOM killer, right?
> >
> > Why do you configure the system to panic on memcg OOM in the first
> > place? This is the wrong thing to do in 99% of cases.
>
> For our production systems, we think a reboot should be used to recover
> the system on OOM, instead of killing the user's key process. Maybe that
> is not the right thing.

I can understand that for the global oom killer but not for memcg. You
can recover from the oom even without killing any process. You can simply
increase the limit from userspace when the oom event is triggered.
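
A minimal sketch of what raising the limit from userspace could look like,
assuming the cgroup v1 memory controller and its eventfd-based OOM
notification (memory.oom_control registered through cgroup.event_control).
The names next_limit and watch_and_raise, and the step/cap policy, are
hypothetical and only for illustration; os.eventfd needs Python 3.10 or
newer on Linux.

```python
import os


def next_limit(current: int, step: int, cap: int) -> int:
    """Hypothetical policy: grow the limit by `step`, never above `cap`."""
    return min(current + step, cap)


def watch_and_raise(cgroup_dir: str, step: int, cap: int) -> None:
    """Wait for memcg OOM events and raise memory.limit_in_bytes.

    `cgroup_dir` is assumed to be a cgroup v1 memory cgroup directory.
    """
    efd = os.eventfd(0)  # counter the kernel signals on memcg OOM
    oom_fd = os.open(os.path.join(cgroup_dir, "memory.oom_control"),
                     os.O_RDONLY)
    try:
        # Register the notifier: "<event_fd> <fd of memory.oom_control>"
        with open(os.path.join(cgroup_dir, "cgroup.event_control"), "w") as f:
            f.write(f"{efd} {oom_fd}")

        limit_path = os.path.join(cgroup_dir, "memory.limit_in_bytes")
        while True:
            os.eventfd_read(efd)  # blocks until the memcg hits OOM
            with open(limit_path) as f:
                current = int(f.read())
            new = next_limit(current, step, cap)
            if new == current:
                break  # cap reached, leave the decision to the admin
            with open(limit_path, "w") as f:
                f.write(str(new))
    finally:
        os.close(oom_fd)
        os.close(efd)
```

Something like watch_and_raise("/sys/fs/cgroup/memory/mygroup", 256 << 20,
8 << 30) would then grow the limit in 256M steps up to 8G; the path is
only an example. If no task should be killed while the limit is being
raised, the memcg oom killer can additionally be disabled by writing 1 to
memory.oom_control, in which case tasks wait until memory becomes
available.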

Triggering a panic from the memcg oom killer is both dangerous and most
probably not something you want.
--
Michal Hocko
SUSE Labs