Re: [PATCH 1/7] Fix mem_cgroup_hierarchical_reclaim() to do stable hierarchy walk.

From: Michal Hocko
Date: Wed Jun 22 2011 - 14:33:39 EST


On Wed 22-06-11 17:15:00, Michal Hocko wrote:
> On Thu 16-06-11 12:51:41, KAMEZAWA Hiroyuki wrote:
> [...]
> > @@ -1667,41 +1668,28 @@ static int mem_cgroup_hierarchical_recla
> >  	if (!check_soft && root_mem->memsw_is_minimum)
> >  		noswap = true;
> >
> > -	while (1) {
> > +again:
> > +	if (!shrink) {
> > +		visit = 0;
> > +		for_each_mem_cgroup_tree(victim, root_mem)
> > +			visit++;
> > +	} else {
> > +		/*
> > +		 * At shrinking, we check the usage again in caller side.
> > +		 * so, visit children one by one.
> > +		 */
> > +		visit = 1;
> > +	}
> > +	/*
> > +	 * We are not draining per cpu cached charges during soft limit reclaim
> > +	 * because global reclaim doesn't care about charges. It tries to free
> > +	 * some memory and charges will not give any.
> > +	 */
> > +	if (!check_soft)
> > +		drain_all_stock_async(root_mem);
> > +
> > +	while (visit--) {
>
> This is racy, isn't it? What prevents some groups from disappearing in the
> meantime? We would then reclaim more than we want from those that are left.
>
> Why cannot we simply do something like (totally untested):
>
> Index: linus_tree/mm/memcontrol.c
> ===================================================================
> --- linus_tree.orig/mm/memcontrol.c 2011-06-22 17:11:54.000000000 +0200
> +++ linus_tree/mm/memcontrol.c 2011-06-22 17:13:05.000000000 +0200
> @@ -1652,7 +1652,7 @@ static int mem_cgroup_hierarchical_recla
>  					unsigned long reclaim_options,
>  					unsigned long *total_scanned)
>  {
> -	struct mem_cgroup *victim;
> +	struct mem_cgroup *victim, *first_victim = NULL;
>  	int ret, total = 0;
>  	int loop = 0;
>  	bool noswap = reclaim_options & MEM_CGROUP_RECLAIM_NOSWAP;
> @@ -1669,6 +1669,11 @@ static int mem_cgroup_hierarchical_recla
>
>  	while (1) {
>  		victim = mem_cgroup_select_victim(root_mem);
> +		if (!first_victim)
> +			first_victim = victim;
> +		else if (first_victim == victim)
> +			break;

This will obviously need css_get and css_put on first_victim to make sure
that the group doesn't disappear while we are still comparing against it.
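
To make the idea concrete, here is a rough userspace model (totally
untested against the real code, and not kernel code). struct group,
struct hierarchy, group_get(), group_put(), select_victim() and
reclaim_one_pass() are all made-up names; they only stand in for
struct mem_cgroup, css_get(), css_put() and mem_cgroup_select_victim().
I am assuming the selector returns the victim with a reference held and
that the caller drops it. The only point is the extra reference on
first_victim, which is held for the whole walk and dropped at the end:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mem_cgroup; not the kernel type. */
struct group {
	int refcnt;	/* models the css reference count */
	int reclaimed;	/* how many times we reclaimed from this group */
};

/* Hypothetical stand-in for the hierarchy below root_mem. */
struct hierarchy {
	struct group *groups;
	size_t count;	/* must be > 0 */
	size_t next;	/* round-robin cursor */
};

/* Models css_get(): pin the group so it cannot go away. */
static void group_get(struct group *g)
{
	g->refcnt++;
}

/* Models css_put(): drop a pin taken by group_get(). */
static void group_put(struct group *g)
{
	assert(g->refcnt > 0);
	g->refcnt--;
}

/*
 * Models mem_cgroup_select_victim(): round-robin selection, returning
 * the group with a reference held (assumption, see above).
 */
static struct group *select_victim(struct hierarchy *h)
{
	struct group *g = &h->groups[h->next];

	h->next = (h->next + 1) % h->count;
	group_get(g);
	return g;
}

/* Visit every group once; stop when the first victim comes around again. */
static int reclaim_one_pass(struct hierarchy *h)
{
	struct group *first_victim = NULL;
	int visited = 0;

	for (;;) {
		struct group *victim = select_victim(h);

		if (!first_victim) {
			first_victim = victim;
			/* Extra pin: the pointer is compared in later iterations. */
			group_get(first_victim);
		} else if (victim == first_victim) {
			group_put(victim);	/* drop the selector's reference */
			break;
		}

		victim->reclaimed++;	/* placeholder for the actual reclaim */
		visited++;
		group_put(victim);	/* drop the selector's reference */
	}

	group_put(first_victim);	/* drop the extra pin */
	return visited;
}
```

With three groups and the cursor starting anywhere, reclaim_one_pass()
touches each group exactly once and leaves every reference count where
it started. Without the extra pin, first_victim could be freed (and its
address reused) between iterations, so the pointer comparison would be
meaningless.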

--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic