Re: [PATCH v3][0/6] memcg: prevent -ENOMEM in pre_destroy()

From: Kamezawa Hiroyuki
Date: Thu Jun 28 2012 - 04:36:12 EST


(2012/06/28 2:58), Tejun Heo wrote:
Hello, KAME.

On Fri, Jun 22, 2012 at 08:27:25AM +0900, Kamezawa Hiroyuki wrote:
The remaining 20% of the work is based on a modification to the cgroup layer.

What do you think of this patch? (It is not tested yet, so it
may have problems.) I think there are not many callers of pre_destroy()...

==
From a28db946f91f3509d25779e8c5db249506cc4b07 Mon Sep 17 00:00:00 2001
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx>
Date: Fri, 22 Jun 2012 08:38:38 +0900
Subject: [PATCH] cgroup: keep cgroup_mutex() while calling ->pre_destroy()

In the past, memcg's pre_destroy() was very slow because it could
reclaim pages. So, cgroup_mutex() was released before calling the
pre_destroy() callbacks. Now it is fast enough: memcg just scans the list
and moves pages to another cgroup, and no memory reclaim happens.
So we can keep holding cgroup_mutex() there.

By holding the lock, we can avoid the following cases:
1. A new task is attached during rmdir().
2. A new child cgroup is created during rmdir().
3. A new task is attached to the cgroup and removed from it before
the css's count is checked, so ->destroy() will be called even if
some garbage left by the task remains.

(3. is a terrible case, even though I think it will not happen in the real world.)
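
To illustrate the locking argument above, here is a minimal user-space sketch, not kernel code and not part of the patch. The names toy_cgroup, toy_mutex, toy_attach(), toy_pre_destroy() and toy_rmdir() are all hypothetical stand-ins, and a pthread mutex plays the role of cgroup_mutex(). Because the removal path keeps the lock across the emptiness check and the pre_destroy() callback, an attach cannot slip in between them.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for a cgroup; not the kernel structure. */
struct toy_cgroup {
	int nr_tasks;      /* tasks currently attached */
	int nr_children;   /* child groups */
	bool removed;      /* set once rmdir has succeeded */
};

/* Plays the role of the global cgroup lock in this sketch. */
static pthread_mutex_t toy_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Attach a task; serialized against toy_rmdir() by toy_mutex. */
static int toy_attach(struct toy_cgroup *cg)
{
	int ret = 0;

	pthread_mutex_lock(&toy_mutex);
	if (cg->removed)
		ret = -ENOENT;
	else
		cg->nr_tasks++;
	pthread_mutex_unlock(&toy_mutex);
	return ret;
}

/* Stand-in for ->pre_destroy(); assumed fast, called with toy_mutex held. */
static void toy_pre_destroy(struct toy_cgroup *cg)
{
	(void)cg;
}

/* Remove the group; the lock is held from the check to the final state change. */
static int toy_rmdir(struct toy_cgroup *cg)
{
	int ret = -EBUSY;

	pthread_mutex_lock(&toy_mutex);
	if (cg->nr_tasks == 0 && cg->nr_children == 0) {
		toy_pre_destroy(cg);   /* no attach or child creation can race here */
		cg->removed = true;
		ret = 0;
	}
	pthread_mutex_unlock(&toy_mutex);
	return ret;
}
```

In this sketch, toy_rmdir() on a group with an attached task returns -EBUSY, and toy_attach() on a removed group returns -ENOENT. Holding the lock across the callback is only reasonable under the assumption stated in the commit message, namely that the callback no longer blocks for a long time.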

Ooh, once memcg drops the __DEPRECATED_clear_css_refs, cgroup_rmdir()
will mark the cgroup dead before it starts calling pre_destroy() and none
of the above will happen.
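
The "mark dead first" ordering described here can be sketched in user space as follows. This is only an illustration under stated assumptions, not the cgroup implementation: dead_cgroup, dc_attach() and dc_rmdir() are hypothetical names, and the pre_destroy_calls counter merely stands in for the callback. The point is that once the dead flag is set under the lock, the lock can be dropped before the callback runs and later attaches still fail.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a cgroup with a "dead" marker. */
struct dead_cgroup {
	pthread_mutex_t lock;
	bool dead;               /* set before the pre_destroy step runs */
	int nr_tasks;            /* tasks currently attached */
	int pre_destroy_calls;   /* counts how often the callback stand-in ran */
};

/* Attach a task; refused as soon as the group is marked dead. */
static int dc_attach(struct dead_cgroup *cg)
{
	int ret = 0;

	pthread_mutex_lock(&cg->lock);
	if (cg->dead)
		ret = -ENOENT;
	else
		cg->nr_tasks++;
	pthread_mutex_unlock(&cg->lock);
	return ret;
}

/* Mark the group dead under the lock, then run the callback unlocked. */
static int dc_rmdir(struct dead_cgroup *cg)
{
	pthread_mutex_lock(&cg->lock);
	if (cg->dead || cg->nr_tasks > 0) {
		int ret = cg->dead ? -ENOENT : -EBUSY;

		pthread_mutex_unlock(&cg->lock);
		return ret;
	}
	cg->dead = true;
	pthread_mutex_unlock(&cg->lock);

	/* Lock dropped: the callback stand-in may be slow, yet nothing can attach. */
	cg->pre_destroy_calls++;
	return 0;
}
```

With this ordering the callback does not need the lock at all, which is why the cases listed in the commit message would not arise.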


Hm, threads which touch memcg should hold memcg's reference count rather than the css's,
right? IIUC, one of the reasons is a reference from kswapd etc. I'll check it.

Thanks,
-Kame