Re: [PATCH] Help Resource Counters Scale better (v4.1)

From: Daisuke Nishimura
Date: Thu Aug 13 2009 - 01:11:51 EST


On Thu, 13 Aug 2009 09:03:35 +0530, Balbir Singh <balbir@xxxxxxxxxxxxxxxxxx> wrote:
> * nishimura@xxxxxxxxxxxxxxxxx <nishimura@xxxxxxxxxxxxxxxxx> [2009-08-13 10:03:50]:
>
> > > @@ -1855,9 +1883,14 @@ __mem_cgroup_uncharge_common(struct page *page, enum charge_type ctype)
> > >  		break;
> > >  	}
> > >
> > > -	res_counter_uncharge(&mem->res, PAGE_SIZE, &soft_limit_excess);
> > > -	if (do_swap_account && (ctype != MEM_CGROUP_CHARGE_TYPE_SWAPOUT))
> > > -		res_counter_uncharge(&mem->memsw, PAGE_SIZE, NULL);
> > > +	if (!mem_cgroup_is_root(mem)) {
> > > +		res_counter_uncharge(&mem->res, PAGE_SIZE, &soft_limit_excess);
> > > +		if (do_swap_account &&
> > > +				(ctype != MEM_CGROUP_CHARGE_TYPE_SWAPOUT))
> > > +			res_counter_uncharge(&mem->memsw, PAGE_SIZE, NULL);
> > > +	}
> > > +	if (ctype == MEM_CGROUP_CHARGE_TYPE_SWAPOUT && mem_cgroup_is_root(mem))
> > > +		mem_cgroup_swap_statistics(mem, true);
> > I think the mem_cgroup_is_root(mem) check is unnecessary here.
> > With it, MEM_CGROUP_STAT_SWAPOUT would not be counted properly for any
> > group other than the root memcgroup.
> >
>
> I think you have a valid point, but it will not impact us currently
> since we use SWAPOUT only for root accounting.
>
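To make the point about the swap statistics concrete, here is a minimal userspace sketch of the uncharge path with the root check dropped from the statistics branch. It is only a model: memcg_model, model_uncharge() and the CHARGE_TYPE_* values are hypothetical stand-ins I made up for illustration, not kernel symbols. Only the control flow follows the quoted hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum charge_type { CHARGE_TYPE_MAPPED, CHARGE_TYPE_SWAPOUT };

struct memcg_model {
	bool is_root;
	long res_pages;		/* stands in for mem->res usage, in pages */
	long memsw_pages;	/* stands in for mem->memsw usage, in pages */
	long stat_swapout;	/* stands in for MEM_CGROUP_STAT_SWAPOUT */
};

/*
 * Follows the control flow of the quoted hunk, except that the
 * statistics update no longer depends on the group being root.
 */
static void model_uncharge(struct memcg_model *mem, enum charge_type ctype,
			   bool do_swap_account)
{
	assert(mem != NULL);

	/* The root group skips the (modelled) res_counter entirely. */
	if (!mem->is_root) {
		mem->res_pages--;
		if (do_swap_account && ctype != CHARGE_TYPE_SWAPOUT)
			mem->memsw_pages--;
	}

	/* Counted for every group, so non-root SWAPOUT stats stay valid. */
	if (ctype == CHARGE_TYPE_SWAPOUT)
		mem->stat_swapout++;
}
```

With this shape, a swap-out uncharge on a child group bumps its own SWAPOUT statistic, while the root group still never touches the counter fields.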
> >
> > > @@ -2461,10 +2496,26 @@ static u64 mem_cgroup_read(struct cgroup *cont, struct cftype *cft)
> > >  	name = MEMFILE_ATTR(cft->private);
> > >  	switch (type) {
> > >  	case _MEM:
> > > -		val = res_counter_read_u64(&mem->res, name);
> > > +		if (name == RES_USAGE && mem_cgroup_is_root(mem)) {
> > > +			val = mem_cgroup_read_stat(&mem->stat,
> > > +					MEM_CGROUP_STAT_CACHE);
> > > +			val += mem_cgroup_read_stat(&mem->stat,
> > > +					MEM_CGROUP_STAT_RSS);
> > > +			val <<= PAGE_SHIFT;
> > > +		} else
> > > +			val = res_counter_read_u64(&mem->res, name);
> > >  		break;
> > >  	case _MEMSWAP:
> > > -		val = res_counter_read_u64(&mem->memsw, name);
> > > +		if (name == RES_USAGE && mem_cgroup_is_root(mem)) {
> > > +			val = mem_cgroup_read_stat(&mem->stat,
> > > +					MEM_CGROUP_STAT_CACHE);
> > > +			val += mem_cgroup_read_stat(&mem->stat,
> > > +					MEM_CGROUP_STAT_RSS);
> > > +			val += mem_cgroup_read_stat(&mem->stat,
> > > +					MEM_CGROUP_STAT_SWAPOUT);
> > > +			val <<= PAGE_SHIFT;
> > > +		} else
> > > +			val = res_counter_read_u64(&mem->memsw, name);
> > >  		break;
> > >  	default:
> > >  		BUG();
> > Considering the use_hierarchy==1 case in the root memcgroup, shouldn't we use
> > mem_cgroup_walk_tree() here to sum up all the children's usage?
> > Currently, *.usage_in_bytes shows the sum of all the children's usage if
> > use_hierarchy==1.
>
> If memory.use_hierarchy=1, we should use total_stats. You're right. Let me
> send out a newer version for review.
>
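For reference, the hierarchical read discussed above could look roughly like the sketch below. This is a userspace model under stated assumptions, not the kernel implementation: memcg_node, model_usage_pages() and model_root_usage_bytes() are hypothetical names, the tree links are simplified, and MODEL_PAGE_SHIFT assumes 4 KiB pages. The real code would go through mem_cgroup_walk_tree() rather than plain recursion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical userspace model of a memcg tree; not kernel code. */
struct memcg_node {
	uint64_t stat_cache;	/* pages, stands in for MEM_CGROUP_STAT_CACHE */
	uint64_t stat_rss;	/* pages, stands in for MEM_CGROUP_STAT_RSS */
	struct memcg_node *first_child;
	struct memcg_node *next_sibling;
};

/* Sum cache + rss pages over the node and all of its descendants. */
static uint64_t model_usage_pages(const struct memcg_node *node)
{
	const struct memcg_node *child;
	uint64_t pages;

	assert(node != NULL);
	pages = node->stat_cache + node->stat_rss;
	for (child = node->first_child; child; child = child->next_sibling)
		pages += model_usage_pages(child);
	return pages;
}

/* Usage in bytes as the root group would report it. */
static uint64_t model_root_usage_bytes(const struct memcg_node *root,
				       int use_hierarchy)
{
	uint64_t pages;

	assert(root != NULL);
	if (use_hierarchy)
		pages = model_usage_pages(root);	/* whole subtree */
	else
		pages = root->stat_cache + root->stat_rss; /* root only */
	return pages << MODEL_PAGE_SHIFT;
}
```

The only point of the sketch is that the root's reported usage differs between the two modes once children hold pages, which is why reading only the root's own statistics is not enough when use_hierarchy==1.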
BTW, I don't think the patch title suits this patch.
It doesn't make res_counter itself scalable at all; it only avoids
res_counter for the root cgroup :)

Of course, I think the patch using percpu_counter for scalability is also important.
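To illustrate what I mean by a per-CPU counter, here is a rough single-threaded sketch of the batching idea: each CPU accumulates a local delta and touches the shared total only when that delta reaches a batch size. This is a model under assumptions, not the kernel's percpu_counter code. The names (model_pcp_counter, model_pcp_add() and so on), the CPU count and the batch size are all hypothetical, and a real implementation needs true per-CPU storage plus a lock around the shared total.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_NR_CPUS	4	/* assumption, for illustration only */
#define MODEL_BATCH	32	/* assumption, for illustration only */

/* Hypothetical single-threaded model of a batched per-CPU counter. */
struct model_pcp_counter {
	int64_t global;			/* shared, approximate total */
	int32_t local[MODEL_NR_CPUS];	/* per-CPU deltas not yet folded in */
};

/*
 * Add `amount` on behalf of `cpu`. The shared total is written only
 * when the local delta reaches the batch size in either direction.
 */
static void model_pcp_add(struct model_pcp_counter *c, int cpu, int32_t amount)
{
	int32_t delta;

	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	delta = c->local[cpu] + amount;
	if (delta >= MODEL_BATCH || delta <= -MODEL_BATCH) {
		c->global += delta;	/* the only contended write */
		c->local[cpu] = 0;
	} else {
		c->local[cpu] = delta;
	}
}

/* Cheap read: can lag by up to MODEL_BATCH - 1 per CPU. */
static int64_t model_pcp_read(const struct model_pcp_counter *c)
{
	return c->global;
}

/* Exact read: fold in every CPU's pending delta. */
static int64_t model_pcp_sum(const struct model_pcp_counter *c)
{
	int64_t sum = c->global;
	int cpu;

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		sum += c->local[cpu];
	return sum;
}
```

The trade-off is that the cheap read is only approximate, so limit checks would need either some slack or the exact sum.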


Thanks,
Daisuke Nishimura.