Re: [RFC] 3.10 kernel- oom with about 24G free memory

From: Michal Hocko
Date: Fri Feb 10 2017 - 04:25:26 EST


On Fri 10-02-17 17:15:59, Yisheng Xie wrote:
> Hi Michal,
>
> Thanks for the comment!
> On 2017/2/10 16:52, Michal Hocko wrote:
> > On Fri 10-02-17 16:48:58, Yisheng Xie wrote:
> >> Hi Michal,
> >>
> >> Thanks for the comment!
> >> On 2017/2/10 15:09, Michal Hocko wrote:
> >>> On Fri 10-02-17 09:13:58, Yisheng Xie wrote:
> >>>> hi Michal,
> >>>> Thanks for your comment.
> >>>>
> >>>> On 2017/2/9 21:41, Michal Hocko wrote:
> > [...]
> >>>>>> OK, so this is a memcg OOM killer which panics because the configuration
> >>>>>> says so. The OOM report doesn't say so and that is the bug. dump_header
> >>>>>> is memcg aware and mem_cgroup_out_of_memory initializes oom_control
> >>>>>> properly. Is this Vanilla kernel?
> >>>>
> >>>> That means we should raise the limit of that memcg to avoid the memcg OOM killer, right?
> >>>
> >>> Why do you configure the system to panic on memcg OOM in the first
> >>> place? This is the wrong thing to do in 99% of cases.
> >>
> >> In our production environment, we think the system should reboot to recover
> >> on OOM instead of killing the user's key process. Maybe that is not the right thing.
> >
> > I can understand that for the global oom killer but not for memcg. You
> > can recover from the oom even without killing any process. You can simply
> > increase the limit from userspace when the oom event is triggered.
>
> So you mean set oom_kill_disable and increase the limit from userspace
> when the memcg is under_oom, right?

yes
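
A minimal userspace sketch of that approach, assuming cgroup v1 with the
memory controller and the usual memory.oom_control and
memory.limit_in_bytes files. The helper names, the group directory and the
step size are hypothetical. For simplicity it checks under_oom when called
rather than registering an eventfd through cgroup.event_control, which is
how a real OOM notification would be delivered.

```python
import os


def parse_oom_control(text):
    """Parse the 'key value' lines of a cgroup v1 memory.oom_control file."""
    fields = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            fields[parts[0]] = int(parts[1])
    return fields


def disable_oom_killer(cg_dir):
    """Write 1 to memory.oom_control so tasks wait instead of being killed."""
    with open(os.path.join(cg_dir, "memory.oom_control"), "w") as f:
        f.write("1")


def raise_limit_if_under_oom(cg_dir, step_bytes):
    """Raise memory.limit_in_bytes by step_bytes if the group is under OOM.

    Returns the new limit, or None if the group is not under OOM.
    cg_dir is a hypothetical memcg directory supplied by the caller.
    """
    with open(os.path.join(cg_dir, "memory.oom_control")) as f:
        state = parse_oom_control(f.read())
    if not state.get("under_oom"):
        return None

    limit_path = os.path.join(cg_dir, "memory.limit_in_bytes")
    with open(limit_path) as f:
        limit = int(f.read().strip())

    new_limit = limit + step_bytes
    with open(limit_path, "w") as f:
        f.write(str(new_limit))
    return new_limit
```

One would call disable_oom_killer() once for the group and then run
raise_limit_if_under_oom() from a monitoring loop or a notification handler.
A real policy also needs an upper bound on how far the limit may grow.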
--
Michal Hocko
SUSE Labs