Re: [patch -mm v2 2/3] mm, memcg: replace cgroup aware oom killer mount option with tunable
From: Andrew Morton
Date: Tue Jan 30 2018 - 14:40:06 EST
On Tue, 30 Jan 2018 13:20:11 +0100 Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> Subject: [PATCH] oom, memcg: clarify root memcg oom accounting
>
> David Rientjes has pointed out that the current way how the root memcg
> is accounted for the cgroup aware OOM killer is undocumented. Unlike
> regular cgroups there is no accounting going on in the root memcg
> (mostly for performance reasons). Therefore we are summing up oom_badness
> of its tasks. This might result in an over accounting because of the
> oom_score_adj setting. Document this for now.
Thanks. Some tweakage:
--- a/Documentation/cgroup-v2.txt~mm-oom-docs-describe-the-cgroup-aware-oom-killer-fix-2-fix
+++ a/Documentation/cgroup-v2.txt
@@ -1292,13 +1292,13 @@ of the OOM'ing cgroup.
Leaf cgroups and cgroups with oom_group option set are compared based
on their cumulative memory usage. The root cgroup is treated as a
-leaf memory cgroup as well, so it's compared with other leaf memory
+leaf memory cgroup as well, so it is compared with other leaf memory
cgroups. Due to internal implementation restrictions the size of
-the root cgroup is a cumulative sum of oom_badness of all its tasks
+the root cgroup is the cumulative sum of oom_badness of all its tasks
(in other words oom_score_adj of each task is obeyed). Relying on
-oom_score_adj (appart from OOM_SCORE_ADJ_MIN) can lead to over or
-underestimating of the root cgroup consumption and it is therefore
-discouraged. This might change in the future, though.
+oom_score_adj (apart from OOM_SCORE_ADJ_MIN) can lead to over- or
+underestimation of the root cgroup consumption and it is therefore
+discouraged. This might change in the future, however.
If there are no cgroups with the enabled memory controller,
the OOM killer is using the "traditional" process-based approach.
_
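
For readers who want to see why summing oom_badness can over- or
underestimate the root cgroup, here is a rough toy model in Python. It is
only a sketch: oom_badness() below is a simplified stand-in for the kernel
heuristic (usage in pages plus oom_score_adj scaled by total pages), and
root_cgroup_size(), the task tuples and the page counts are made up for
illustration. None of it is the kernel code.

```python
OOM_SCORE_ADJ_MIN = -1000


def oom_badness(usage_pages, oom_score_adj, total_pages):
    """Simplified stand-in for the kernel heuristic (an assumption)."""
    # Tasks at OOM_SCORE_ADJ_MIN are exempt and contribute nothing.
    if oom_score_adj == OOM_SCORE_ADJ_MIN:
        return 0
    # The adjustment is expressed in thousandths of total memory.
    points = usage_pages + oom_score_adj * (total_pages // 1000)
    # A killable task never scores below 1.
    return max(points, 1)


def root_cgroup_size(tasks, total_pages):
    """Hypothetical helper: sum of oom_badness over the root cgroup's tasks."""
    return sum(oom_badness(usage, adj, total_pages) for usage, adj in tasks)


if __name__ == "__main__":
    total_pages = 1_000_000
    # Each task is (usage_pages, oom_score_adj); real usage is 30000 pages.
    neutral = [(10_000, 0), (10_000, 0), (10_000, 0)]
    skewed = [(10_000, 0), (10_000, 500), (10_000, -500)]
    print(root_cgroup_size(neutral, total_pages))  # 30000, matches real usage
    print(root_cgroup_size(skewed, total_pages))   # 520001, overestimate
```

With all adjustments at 0 the sum equals the real usage. A single task with
a positive oom_score_adj inflates the root cgroup's score well past what it
actually consumes, while negative values deflate it, which is the effect the
documentation text warns about.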