Re: [PATCH] doc: memcontrol: add description for oom_kill

From: Yang Shi
Date: Mon Mar 01 2021 - 15:33:45 EST


On Mon, Mar 1, 2021 at 4:15 AM Michal Hocko <mhocko@xxxxxxxx> wrote:
>
> On Fri 26-02-21 08:42:29, Yang Shi wrote:
> > On Thu, Feb 25, 2021 at 11:30 PM Michal Hocko <mhocko@xxxxxxxx> wrote:
> > >
> > > On Thu 25-02-21 18:12:54, Yang Shi wrote:
> > > > When debugging an oom issue, I found the oom_kill counter of memcg
> > > > confusing. At first glance, without checking the documentation, I
> > > > thought it only counted memcg ooms, but it turns out it counts both
> > > > global and memcg ooms.
> > >
> > > Yes, this is the case indeed. The point of the counter was to count oom
> > > victims from the memcg rather than matching that to the source of the
> > > > oom. Remember that this could have been a memcg oom up in the
> > > hierarchy as well. Counting victims on the oom origin could be equally
> >
> > Yes, it is updated hierarchically in v2, but not in v1. I suppose
> > this is because v1 may work in non-hierarchical mode? If this is the
> > only reason, we may be able to remove this to align with v2, since
> > non-hierarchical mode is no longer supported.
>
> I believe the reason is that v1 can have tasks in the intermediate
> (non-leaf) memcgs. So you wouldn't have a way to tell whether the oom
> kill has happened in such a memcg or somewhere down the hierarchy.

Aha, I forgot about that, that's bad. Although we don't have tasks in
intermediate nodes in practice, I do understand that v1 does not forbid
it the way cgroup v2 does.
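
To illustrate the ambiguity described above, here is a toy model (not
kernel code). The names hierarchical_counts, parent_of and local_kills
are made up for the example. It assumes a purely hierarchical counter
where every kill is also charged to all ancestors:

```python
def hierarchical_counts(parent_of, local_kills):
    """Charge each memcg's own kills to itself and to every ancestor."""
    totals = {name: 0 for name in parent_of}
    for name, kills in local_kills.items():
        node = name
        while node is not None:
            totals[node] += kills
            node = parent_of[node]
    return totals


# Hypothetical two-level hierarchy: A is an intermediate memcg, A/B a leaf.
parent_of = {"A": None, "A/B": "A"}

# Case 1: the victim was a task attached directly to A (allowed in v1).
case1 = hierarchical_counts(parent_of, {"A": 1, "A/B": 0})

# Case 2: the victim was a task in the leaf A/B.
case2 = hierarchical_counts(parent_of, {"A": 0, "A/B": 1})

# A's counter is identical in both cases, so A alone cannot tell them apart.
same_at_a = case1["A"] == case2["A"]
```

Looking only at A, the two cases are indistinguishable. That is harmless
when intermediate memcgs cannot hold tasks, but not when they can.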

> --
> Michal Hocko
> SUSE Labs