Re: [v10 3/6] mm, oom: cgroup-aware OOM killer

From: Roman Gushchin
Date: Wed Oct 04 2017 - 16:16:02 EST


On Wed, Oct 04, 2017 at 12:48:03PM -0700, Shakeel Butt wrote:
> > +
> > +static void select_victim_memcg(struct mem_cgroup *root, struct oom_control *oc)
> > +{
> > + struct mem_cgroup *iter;
> > +
> > + oc->chosen_memcg = NULL;
> > + oc->chosen_points = 0;
> > +
> > + /*
> > + * The oom_score is calculated for leaf memory cgroups (including
> > + * the root memcg).
> > + */
> > + rcu_read_lock();
> > + for_each_mem_cgroup_tree(iter, root) {
> > + long score;
> > +
> > + if (memcg_has_children(iter))
> > + continue;
>
> && iter != root_mem_cgroup ?

Oh, sure. I had a stupid bug in my test script, which prevented me from
catching this. Thanks!

This should fix the problem.
--
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 2e82625bd354..b3848bce4c86 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2807,7 +2807,8 @@ static void select_victim_memcg(struct mem_cgroup *root, struct oom_control *oc)
* We don't consider non-leaf non-oom_group memory cgroups
* as OOM victims.
*/
- if (memcg_has_children(iter) && !mem_cgroup_oom_group(iter))
+ if (memcg_has_children(iter) && iter != root_mem_cgroup &&
+ !mem_cgroup_oom_group(iter))
continue;

/*
@@ -2820,7 +2821,7 @@ static void select_victim_memcg(struct mem_cgroup *root, struct oom_control *oc)
group_score = 0;
}

- if (memcg_has_children(iter))
+ if (memcg_has_children(iter) && iter != root_mem_cgroup)
continue;

score = oom_evaluate_memcg(iter, oc->nodemask, oc->totalpages);

--

>
> > +
> > + score = oom_evaluate_memcg(iter, oc->nodemask, oc->totalpages);
> > +
> > + /*
> > + * Ignore empty and non-eligible memory cgroups.
> > + */
> > + if (score == 0)
> > + continue;
> > +
> > + /*
> > + * If there are inflight OOM victims, we don't need
> > + * to look further for new victims.
> > + */
> > + if (score == -1) {
> > + oc->chosen_memcg = INFLIGHT_VICTIM;
> > + mem_cgroup_iter_break(root, iter);
> > + break;
> > + }
> > +
>
> Shouldn't there be a CSS_ONLINE check? Also instead of css_get at the
> end why not css_tryget_online() here and css_put for the previous
> selected one.

Hm, why do we need to check this? I do not see, how we can choose
an OFFLINE memcg as a victim, tbh. Please, explain the problem.

Thank you!