Re: [PATCH] mm: add group_oom_kill memory event

From: Dan Schatzberg
Date: Fri Dec 10 2021 - 15:00:13 EST


On Fri, Dec 03, 2021 at 04:45:54PM -0800, Shakeel Butt wrote:
> On Fri, Dec 3, 2021 at 8:24 AM Dan Schatzberg <schatzberg.dan@xxxxxxxxx> wrote:
> >
> > Our container agent wants to know when a container exits if it was OOM
> > killed or not to report to the user. We use memory.oom.group = 1 to
> > ensure that OOM kills within the container's cgroup kill
> > everything. Existing memory.events are insufficient for knowing if
> > this triggered:
> >
> > 1) Our current approach reads memory.events oom_kill and reports the
> > container was killed if the value is non-zero. This is erroneous in
> > some cases where containers create their children cgroups with
> > memory.oom.group=1 as such OOM kills will get counted against the
> > container cgroup's oom_kill counter despite not actually OOM killing
> > the entire container.
> >
> > 2) Reading memory.events.local will fail to identify OOM kills in leaf
> > cgroups (that don't set memory.oom.group) within the container cgroup.
> >
> > This patch adds a new oom_group_kill event when memory.oom.group
> > triggers to allow userspace to cleanly identify when an entire cgroup
> > is oom killed.
> >
> > Signed-off-by: Dan Schatzberg <schatzberg.dan@xxxxxxxxx>
>
> So, with this patch, will you be watching oom_group_kill from
> memory.events or memory.events.local file for your use-case?
>
> Reviewed-by: Shakeel Butt <shakeelb@xxxxxxxxxx>

We will watch memory.events.local. If containers want to
construct their own child cgroups and allow group OOM kills to occur
inside them, that's fine - a later container exit should not result in us
claiming the container was OOM killed. If the container exits and
memory.events.local shows oom_group_kill > 0, then we know the container
was OOM killed.
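
For illustration, the check described above could look roughly like the
sketch below. It is not our agent's actual code: the function names and
the cgroup path in the usage note are hypothetical, and it assumes the
usual flat "key value" layout of the cgroup v2 memory.events files. On a
kernel without this patch the oom_group_kill key is simply absent, which
the sketch treats as zero.

```python
from pathlib import Path


def parse_memory_events(text):
    """Parse flat 'key value' lines (memory.events format) into a dict."""
    events = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip anything that is not exactly "<name> <unsigned integer>".
        if len(parts) == 2 and parts[1].isdigit():
            events[parts[0]] = int(parts[1])
    return events


def container_was_group_oom_killed(cgroup_dir):
    """Return True if the cgroup's own oom_group_kill counter is non-zero.

    Reads memory.events.local so that group OOM kills inside child
    cgroups created by the container are not counted.
    """
    text = Path(cgroup_dir, "memory.events.local").read_text()
    return parse_memory_events(text).get("oom_group_kill", 0) > 0
```

An agent would call something like
container_was_group_oom_killed("/sys/fs/cgroup/<container cgroup>") once
the container has exited and before removing the cgroup directory.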