Re: [PATCH v4] Print the memcg's name when system-wide OOM happened

From: Michal Hocko
Date: Thu May 31 2018 - 02:49:59 EST


On Wed 30-05-18 13:42:56, Andrew Morton wrote:
> On Mon, 21 May 2018 03:39:46 +0100 ufo19890607 <ufo19890607@xxxxxxxxx> wrote:
>
> > From: yuzhoujian <yuzhoujian@xxxxxxxxxxxxxxx>
> >
> > The dump_header does not print the memcg's name when the system
> > oom happened. So users cannot locate the certain container which
> > contains the task that has been killed by the oom killer.
> >
> > System oom report will print the memcg's name after this patch,
> > so users can get the memcg's path from the oom report and check
> > the certain container more quickly.
>
> lkp-robot is reporting an oops.
>
> > --- a/mm/oom_kill.c
> > +++ b/mm/oom_kill.c
> > @@ -433,6 +433,7 @@ static void dump_header(struct oom_control *oc, struct task_struct *p)
> >  	if (is_memcg_oom(oc))
> >  		mem_cgroup_print_oom_info(oc->memcg, p);
> >  	else {
> > +		mem_cgroup_print_oom_memcg_name(oc->memcg, p);
> >  		show_mem(SHOW_MEM_FILTER_NODES, oc->nodemask);
> >  		if (is_dump_unreclaim_slabs())
> >  			dump_unreclaimable_slab();
>
> static inline bool is_memcg_oom(struct oom_control *oc)
> {
> 	return oc->memcg != NULL;
> }
>
> So in the mem_cgroup_print_oom_memcg_name() call which this patch adds,
> oc->memcg is known to be NULL. How can this possibly work?

This version is broken. The current version [1] seems to be doing the
right thing in that regard AFAICS. It has some other issues though.
Can we drop the current code from the mmotm tree and start over?
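
To make the problem Andrew points out concrete, here is a minimal
userspace sketch. It is not kernel code: the struct layouts,
memcg_name() and header_memcg_name() are hypothetical stand-ins, and
only is_memcg_oom() mirrors the helper quoted above. It shows that the
non-memcg branch can only be reached with oc->memcg == NULL, so nothing
useful can be read from that pointer there.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; the real kernel structs differ. */
struct mem_cgroup {
	const char *name;
};

struct oom_control {
	struct mem_cgroup *memcg;	/* NULL for a system-wide OOM */
};

/* Same predicate as the helper quoted above. */
static inline bool is_memcg_oom(const struct oom_control *oc)
{
	return oc->memcg != NULL;
}

/*
 * Hypothetical accessor. The assert stands in for the oops a NULL
 * dereference would cause in the kernel.
 */
static const char *memcg_name(const struct mem_cgroup *memcg)
{
	assert(memcg != NULL);
	return memcg->name;
}

/*
 * Hypothetical model of the branch in dump_header(): returns the name
 * that can be obtained through oc, or NULL when oc carries no memcg.
 */
static const char *header_memcg_name(const struct oom_control *oc)
{
	if (is_memcg_oom(oc))
		return memcg_name(oc->memcg);

	/*
	 * System-wide OOM: oc->memcg is known to be NULL here, so
	 * passing it to memcg_name() would trip the assert.
	 */
	return NULL;
}
```

In this model a system-wide OOM never yields a name through oc, so any
name printed in that branch has to come from a source other than
oc->memcg.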

[1] http://lkml.kernel.org/r/1527413551-5982-1-git-send-email-ufo19890607@xxxxxxxxx
--
Michal Hocko
SUSE Labs