[PATCH] memcg: fix css reference leak from mem_cgroup_iter

From: Michal Hocko
Date: Thu Jan 16 2014 - 10:17:13 EST

19f39402864e (memcg: simplify mem_cgroup_iter) has introduced a css
refrence leak (thus memory leak) because mem_cgroup_iter makes sure it
doesn't put a css reference on the root of the tree walk. The mentioned
commit however dropped the root check when the css reference is taken
while it keept the css_put optimization fora the root in place.

This means that css_put is not called and so css along with mem_cgroup
and other cgroup internal object tied by css lifetime are never freed.

Fix the issue by reintroducing root check in __mem_cgroup_iter_next.

This patch also fixes issue reported by Hugh Dickins when
mem_cgroup_iter might end up in an endless loop because a group which is
under hard limit reclaim is removed in parallel with iteration.
__mem_cgroup_iter_next would always return NULL because css_tryget on
the root (reclaimed memcg) would fail and there are no other memcg in
the hierarchy. prev == NULL in mem_cgroup_iter would prevent break out
from the root and so the while (!memcg) loop would never terminate.
as css_tryget is no longer called for the root of the tree walk this
doesn't happen anymore.

[hughd@xxxxxxxxxx: Fixed root vs. root->css fix]
[hughd@xxxxxxxxxx: Get rid of else branch because it is ugly]
Cc: stable@xxxxxxxxxxxxxxx # 3.10+
<Hugh's-selection>-by: Hugh Dickins <hughd@xxxxxxxxxx>
Signed-off-by: Michal Hocko <mhocko@xxxxxxx>
mm/memcontrol.c | 18 +++++++++++++-----
1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index f016d26adfd3..969f14d32b30 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1076,14 +1076,22 @@ skip_node:
* skipped and we should continue the tree walk.
* last_visited css is safe to use because it is
* protected by css_get and the tree walk is rcu safe.
+ *
+ * We do not take a reference on the root of the tree walk
+ * because we might race with the root removal when it would
+ * be the only node in the iterated hierarchy and mem_cgroup_iter
+ * would end up in an endless loop because it expects that at
+ * least one valid node will be returned. Root cannot disappear
+ * because caller of the iterator should hold it already so
+ * skipping css reference should be safe.
if (next_css) {
- if ((next_css->flags & CSS_ONLINE) && css_tryget(next_css))
+ if ((next_css->flags & CSS_ONLINE) &&
+ (next_css == root->css || css_tryget(next_css)))
return mem_cgroup_from_css(next_css);
- else {
- prev_css = next_css;
- goto skip_node;
- }
+ prev_css = next_css;
+ goto skip_node;

return NULL;

Michal Hocko
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/