Re: [PATCH] Add the memcg print oom info for system oom

From: David Rientjes
Date: Mon May 21 2018 - 16:16:47 EST


On Thu, 17 May 2018, Michal Hocko wrote:

> this is not 5 lines at all. We dump memcg stats for the whole oom memcg
> subtree. For your patch it would be the whole subtree of the memcg of
> the oom victim. With cgroup v1 this can be quite deep as tasks can
> belong to inter-nodes as well. Would be
>
> pr_info("Task in ");
> pr_cont_cgroup_path(task_cgroup(p, memory_cgrp_id));
> pr_cont(" killed as a result of limit of ");
>
> part of that output sufficient for your usecase?

There's no memcg to print as the limit in the above, but it does seem like
the single line output is all that is needed in this case.

It might be useful to discuss a single line output that specifies relevant
information about the context of the oom kill, the killed thread, and the
memcg of that thread, in a way that will be backwards compatible. The
messages in the oom killer have been restructured over time, so I don't
believe there is currently a backwards compatible way to search for an oom
event in the kernel log.

I've had success with defining a single line output that includes the
CONSTRAINT_* of the oom kill, the origin and kill memcgs, the thread name,
pid, and uid. On system oom kills, origin and kill memcgs are left empty.

oom-kill constraint=CONSTRAINT_* origin_memcg=<memcg> kill_memcg=<memcg> task=<comm> pid=<pid> uid=<uid>
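
As a rough illustration of the format above, here is a minimal userspace
sketch of how such a line could be assembled. The helper name
format_oom_kill_line() and its parameters are hypothetical and are not
existing kernel code; in the kernel this would presumably be a single
pr_info() call instead. It assumes the memcg paths and comm are already
available as strings, and it treats a NULL memcg as the system oom case.

```c
#include <assert.h>
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only: formats the proposed
 * single line oom-kill record into buf. Returns the snprintf() result,
 * i.e. the length the full line needs, excluding the trailing NUL.
 */
static int format_oom_kill_line(char *buf, size_t len,
				const char *constraint,
				const char *origin_memcg,
				const char *kill_memcg,
				const char *comm, int pid, unsigned int uid)
{
	assert(buf && constraint && comm);

	/* On system oom kills, origin and kill memcgs are left empty. */
	if (!origin_memcg)
		origin_memcg = "";
	if (!kill_memcg)
		kill_memcg = "";

	return snprintf(buf, len,
			"oom-kill constraint=%s origin_memcg=%s kill_memcg=%s task=%s pid=%d uid=%u",
			constraint, origin_memcg, kill_memcg, comm, pid, uid);
}
```

Because every field is a key=value pair on one line, a log search can key
on the "oom-kill " prefix and ignore fields it does not know about, which
is what would allow new fields to be appended later without breaking
existing parsers. A comm containing spaces would still need some care.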

Perhaps we should introduce a backwards compatible single line output that
includes this information?