[PATCHSET] cpuset: decouple cpuset locking from cgroup core, take#2

From: Tejun Heo
Date: Thu Jan 03 2013 - 16:36:00 EST

Hello, guys.

This is the second attempt at decoupling cpuset locking from cgroup
core. Changes from the last take[L] are

* cpuset-drop-async_rebuild_sched_domains.patch moved from 0007 to
0009. This reordering makes cpu hotplug handling async first and
removes the temporary cyclic locking dependency.

* 0006-cpuset-cleanup-cpuset-_can-_attach.patch no longer converts
cpumask_var_t to cpumask_t as per Rusty Russell.

* 0008-cpuset-don-t-nest-cgroup_mutex-inside-get_online_cpu.patch now
synchronously rebuilds sched domains from cpu hotplug callback.
This fixes various issues caused by a confused scheduler putting
tasks into a dead cpu including the RCU stall problem reported by Li

Original patchset description follows.

Depending on cgroup core locking - cgroup_mutex - is messy and makes
cgroup prone to locking dependency problems. The current code already
has a lock dependency loop - memcg nests get_online_cpus() inside
cgroup_mutex, while cpuset nests them the other way around.
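
As a rough user-space illustration of that loop (not kernel code and not
part of the patchset), the sketch below records which lock class was
taken while another was held and flags the inverse order. The names
lock_class, order_seen and record_nesting() are made up for this example,
and it only catches two-lock inversions, far less than what lockdep does.

```c
#include <stdbool.h>

/* Hypothetical lock classes, named after the two locks discussed above. */
enum lock_class { CGROUP_MUTEX, CPU_HOTPLUG, NR_LOCK_CLASSES };

/* order_seen[a][b] is true once some path took b while holding a. */
static bool order_seen[NR_LOCK_CLASSES][NR_LOCK_CLASSES];

/*
 * Record that "inner" was acquired while "outer" was held.
 * Returns false if the inverse nesting was recorded earlier, i.e. the
 * two paths together form an AB-BA dependency loop.
 */
static bool record_nesting(enum lock_class outer, enum lock_class inner)
{
	order_seen[outer][inner] = true;
	return !order_seen[inner][outer];
}
```

Recording the memcg-style nesting first, record_nesting(CGROUP_MUTEX,
CPU_HOTPLUG), is accepted; the cpuset-style nesting that follows,
record_nesting(CPU_HOTPLUG, CGROUP_MUTEX), is reported as a loop.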

Regardless of the locking details, whatever is protecting cgroup
inherently has to be outer to most other locking constructs.
cgroup calls into a lot of major subsystems which in turn have to
perform subsystem-specific locking. Trying to nest cgroup
synchronization inside other locks isn't something which can work

cgroup now has enough API to allow subsystems to implement their own
locking and cgroup_mutex is scheduled to be made private to cgroup
core. This patchset makes cpuset implement its own locking instead of
relying on cgroup_mutex.

cpuset is rather nasty in this respect. Some of it seems to have come
from the implementation history - cgroup core grew out of cpuset - but
a big part stems from cpuset's need to migrate tasks to an ancestor
cgroup when a hotunplug event makes a cpuset empty (w/o any cpu or

This patchset decouples cpuset locking from cgroup_mutex. After the
patchset, cpuset uses cpuset-specific cpuset_mutex instead of
cgroup_mutex. This also removes the lockdep warning triggered during
cpu offlining (see 0009).
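
The general shape of "subsystem-private lock instead of a core lock" can
be sketched in user space as below. This is only an analogy built on
pthreads, not the actual cpuset code: subsys_mutex stands in for
cpuset_mutex, and allowed_cpus, subsys_clear_cpu() and subsys_read_cpus()
are hypothetical names invented for the example.

```c
#include <limits.h>
#include <pthread.h>

/* Hypothetical stand-in for cpuset_mutex: private to this subsystem. */
static pthread_mutex_t subsys_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical subsystem state, guarded only by subsys_mutex. */
static unsigned long allowed_cpus = 0xfUL;

/* Drop one cpu from the mask, e.g. in response to a hotunplug event. */
static void subsys_clear_cpu(unsigned int cpu)
{
	/* Ignore cpu numbers that do not fit in the mask. */
	if (cpu >= sizeof(allowed_cpus) * CHAR_BIT)
		return;

	pthread_mutex_lock(&subsys_mutex);
	allowed_cpus &= ~(1UL << cpu);
	pthread_mutex_unlock(&subsys_mutex);
}

/* Return a consistent snapshot of the mask. */
static unsigned long subsys_read_cpus(void)
{
	unsigned long snapshot;

	pthread_mutex_lock(&subsys_mutex);
	snapshot = allowed_cpus;
	pthread_mutex_unlock(&subsys_mutex);
	return snapshot;
}
```

Because every access to the state goes through the subsystem's own
mutex, no caller needs to hold a core-wide lock just to read or update
it.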

Note that this leaves memcg as the only external user of cgroup_mutex.
Michal, Kame, can you guys please convert memcg to use its own locking

This patchset contains the following thirteen patches.


0001-0006 are prep patches.

0007-0009 make cpuset nest get_online_cpus() inside cgroup_mutex, not
the other way around.

0010-0012 plug holes which would be exposed by switching to
cpuset-specific locking.

0013 replaces cgroup_mutex with cpuset_mutex.

This patchset is on top of v3.8-rc2 (d1c3ed669a) and also available in
the following git branch.

git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git review-cpuset-locking

diffstat follows.

kernel/cpuset.c | 760 ++++++++++++++++++++++++++++++++------------------------
1 file changed, 438 insertions(+), 322 deletions(-)



[L] http://thread.gmane.org/gmane.linux.kernel.cgroups/5251