Re: [RFC][PATCH -mm 0/7] memcg: lockless page_cgroup v1

From: KAMEZAWA Hiroyuki
Date: Wed Aug 20 2008 - 22:11:55 EST


On Wed, 20 Aug 2008 20:00:06 +0900
KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx> wrote:

> On Wed, 20 Aug 2008 19:41:08 +0900
> KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx> wrote:
>
> > On Wed, 20 Aug 2008 18:53:06 +0900
> > KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx> wrote:
> >
> > > Hi, this is a patch set for lockless page_cgroup.
> > >
> > > I dropped the patches related to the mem+swap controller to make
> > > review easier. (I'm rewriting them, too.)
> > >
> > > Changes from current -mm:
> > > - page_cgroup->flags operations are now atomic.
> > > - lock_page_cgroup() is removed.
> > > - page->page_cgroup is changed from unsigned long to struct page_cgroup*
> > > - page_cgroup is freed by RCU.
> > > - To avoid a race, charge/uncharge in mm/memory.c::insert_page() is
> > > omitted. insert_page() is usually used for mapping a device's pages.
> > > (I think...)
> > >
> > > In my quick test, performance improved a little. But the real benefit
> > > of this patch set is that page_cgroup can be accessed without a lock.
> > > I think this is good for Yamamoto's dirty page tracking for memcg.
> > > For the I/O tracking people, I added a header file that allows access
> > > to page_cgroup from outside memcontrol.c.
> > >
> > > The base kernel is a recent mmotm. Any comments are welcome.
> > > This is still under test. I have to do a long-run test before removing "RFC".
> > >
> > Known problem: force_empty is broken, so rmdir will get stuck in a nightmare.
> > This is caused by patch 2/7.
> > It will be fixed in the next version.
> >
>
> This is a quick fix, but I think I can find a better solution.
> ==
> Because removal from the LRU is delayed, mz->lru will never be empty until
> someone kicks a drain. This patch rotates the LRU during force_empty so
> that the page_cgroups get freed.
>

I'd like to rewrite force_empty to move all usage to the "default" cgroup.
There are two reasons.

1. The current force_empty leaves a live page that has no page_cgroup.
This is bad for any routine that wants to reach the page_cgroup from the
page, and this behavior will become a race condition issue in the future.
2. We can see the amount of out-of-control usage in the default cgroup.

But to do this, I'll have to avoid "hitting the limit" in the default cgroup.
I'm now thinking of making it impossible to set a limit on the default cgroup.
(I'll show this as a patch in the next version of the series.)
Does anyone have an idea?
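
To make the idea concrete, here is a rough user-space sketch, not kernel
code. Every name in it (toy_cgroup, toy_page_cgroup, toy_set_limit,
toy_force_empty, NO_LIMIT) is made up for illustration, and it assumes the
default cgroup simply refuses any limit. The real patch may look quite
different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
#define NO_LIMIT ((unsigned long)-1)

struct toy_cgroup {
	unsigned long usage;	/* pages currently charged to this group */
	unsigned long limit;	/* NO_LIMIT: the group can never hit a limit */
	bool is_default;	/* true only for the "default" group */
};

struct toy_page_cgroup {
	struct toy_cgroup *owner;	/* never NULL while the page is alive */
};

/* Assumption: setting a limit on the default group is rejected outright,
 * so moving charges into it can never fail. */
static int toy_set_limit(struct toy_cgroup *cg, unsigned long limit)
{
	if (cg->is_default)
		return -1;
	cg->limit = limit;
	return 0;
}

/* Re-point one page at the default group instead of dropping its
 * page_cgroup, so page -> page_cgroup lookups always succeed. */
static void toy_move_to_default(struct toy_page_cgroup *pc,
				struct toy_cgroup *def)
{
	assert(def->is_default && def->limit == NO_LIMIT);
	pc->owner->usage--;
	def->usage++;
	pc->owner = def;
}

/* force_empty in this model: move every page charged to 'cg' over to
 * 'def'. Returns the number of pages moved. */
static size_t toy_force_empty(struct toy_cgroup *cg, struct toy_cgroup *def,
			      struct toy_page_cgroup *pcs, size_t n)
{
	size_t moved = 0;

	for (size_t i = 0; i < n; i++) {
		if (pcs[i].owner == cg) {
			toy_move_to_default(&pcs[i], def);
			moved++;
		}
	}
	return moved;
}
```

In this model, calling toy_force_empty() on a child group leaves its usage
at 0, and the moved pages show up as usage of the default group, which is
the "out-of-control usage" from reason 2.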

Thanks,
-Kame
