Re: [PATCH] mm/oom_kill: count global and memory cgroup oom kills

From: David Rientjes
Date: Wed May 24 2017 - 16:43:44 EST


On Tue, 23 May 2017, Konstantin Khlebnikov wrote:

> This is a worthwhile addition. Let's call it "oom_victim" for short.
>
> It allows locating the leaky part when leaks are spread over sub-containers
> within a common limit.
> But it doesn't tell which limit caused the kill. For hierarchical limits
> this might not be so easy.
>
> I think oom_kill is better suited to automatic actions: restarting the
> affected hierarchy, increasing limits, etc.
> But oom_victim allows determining which container was affected by the
> global oom killer.
>
> So it's probably worth merging them and having the global killer increment
> oom_kill for the victim memcg:
>
> 	if (!is_memcg_oom(oc)) {
> 		count_vm_event(OOM_KILL);
> 		mem_cgroup_count_vm_event(mm, OOM_KILL);
> 	} else
> 		mem_cgroup_event(oc->memcg, OOM_KILL);
>

Our complete solution is a complementary memory.oom_kill_control file that
lets users register for eventfd(2) notification when the kernel oom killer
kills a victim, but that is only because we have had complete support for
userspace oom handling for years. When read, it exports three classes of
information:

 - the "total" (hierarchical) and "local" (memcg specific) number of oom
   kills for system oom conditions (overcommit),

 - the "total" and "local" number of oom kills for memcg oom conditions,
   and

 - the total number of processes in the hierarchy where an oom victim was
   reaped successfully and unsuccessfully.
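
To illustrate the "total" versus "local" distinction, here is a minimal
userspace sketch. It assumes that "total" means a memcg's own "local" count
plus the totals of all of its descendants. The struct and function names
(memcg_node, oom_totals, memcg_total_kills) are hypothetical and are not
taken from any kernel tree; this is a model of the semantics only, not the
implementation.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHILDREN 8

/* Hypothetical userspace model of one memcg's kill counters; not kernel code. */
struct memcg_node {
	unsigned long local_system_kills;	/* victims in this memcg, system oom */
	unsigned long local_memcg_kills;	/* victims in this memcg, memcg oom */
	struct memcg_node *children[MAX_CHILDREN];
	size_t nr_children;
};

struct oom_totals {
	unsigned long system_kills;
	unsigned long memcg_kills;
};

/*
 * Assumed semantics: "total" is this memcg's "local" count plus the
 * "total" of every child, computed recursively over the hierarchy.
 */
static struct oom_totals memcg_total_kills(const struct memcg_node *node)
{
	struct oom_totals t;
	size_t i;

	assert(node != NULL);
	t.system_kills = node->local_system_kills;
	t.memcg_kills = node->local_memcg_kills;
	for (i = 0; i < node->nr_children; i++) {
		struct oom_totals c = memcg_total_kills(node->children[i]);

		t.system_kills += c.system_kills;
		t.memcg_kills += c.memcg_kills;
	}
	return t;
}
```

With a parent that has 4 local system kills and two children holding 1
system kill and 2 + 3 memcg kills between them, the parent's totals under
this model are 5 and 5, while its local memcg count stays 0.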

One benefit is that we no longer have to scrape the kernel log for oom
events, which has been troublesome in the past; userspace can still easily
do so when the eventfd triggers for the kill notification.
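
For readers unfamiliar with the notification side, a rough userspace sketch
follows. It assumes memory.oom_kill_control would be registered the same way
cgroup v1 registers memory.oom_control, by writing "<event_fd> <control_fd>"
to cgroup.event_control; that is an assumption, since the file is not
upstream. The helper names format_event_registration and wait_for_oom_kill
are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/eventfd.h>
#include <unistd.h>

/*
 * Build the "<event_fd> <control_fd>" line that cgroup v1 expects to be
 * written to cgroup.event_control. Assumption: memory.oom_kill_control
 * uses the same registration format as memory.oom_control.
 * Returns 0 on success, -1 if buf is too small.
 */
static int format_event_registration(char *buf, size_t len, int efd, int cfd)
{
	int n;

	assert(buf != NULL);
	n = snprintf(buf, len, "%d %d", efd, cfd);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Block until the eventfd is signalled. On success, *count holds the
 * number of notifications accumulated since the previous read.
 * Returns 0 on success, -1 on error or short read.
 */
static int wait_for_oom_kill(int efd, uint64_t *count)
{
	ssize_t r = read(efd, count, sizeof(*count));

	return r == (ssize_t)sizeof(*count) ? 0 : -1;
}
```

A caller would create the eventfd with eventfd(0, 0), open the control file
and cgroup.event_control in the memcg directory, write the formatted line to
the latter, and then loop on wait_for_oom_kill(), re-reading the control
file (or the kernel log) each time it returns.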