[PATCHSET for-4.14] cgroup: implement cgroup2 thread mode, v3

From: Tejun Heo
Date: Sun Jul 16 2017 - 22:07:47 EST


Hello,

This is v3 of cgroup2 thread mode patchset. The changes from v2[L]
are

* Switched to marking each cgroup threaded instead of doing it
per-subtree as suggested by PeterZ. This allows more flexibility
and removes certain interface quirks.

* Dropped RFC tag and excluded cpu controller patches from this
patchset as threaded mode behaviors can easily be verified with the
pid controller. Will follow up with cpu controller patchset later.

The design is largely based on the discussions that we had at
Plumbers last year. Here's the rough outline.

* Thread mode is explicitly enabled on a cgroup by writing "threaded"
into "cgroup.type" file. The cgroup shouldn't have any processes or
child cgroups. A threaded cgroup joins the parent's resource
domain and becomes a part of the threaded subtree anchored at the
nearest domain ancestor, which is called the threaded domain cgroup
of the subtree.

* Threads can be put anywhere in a threaded subtree by writing TIDs
into "cgroup.threads" file. Process granularity and
no-internal-process constraint don't apply in a threaded subtree.

* To be used in a threaded subtree, controllers should explicitly
declare thread mode support and should be able to handle internal
competition in some way.

* The threaded domain cgroup of a threaded subtree serves as the
resource domain for the whole subtree. This is where all the
controllers are guaranteed to have a common ground, and where
resource consumption in the threaded subtree which isn't tied to a
specific thread is charged. Non-threaded controllers never see beyond
the thread root and can assume that all controllers will follow the
same rules up to that point.

* Unlike other cgroups, the system root cgroup can serve both as the
parent of domain child cgroups and as the threaded domain of threaded
subtrees.
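
As a rough userspace illustration of the interface outlined above, the
following C sketch writes "threaded" into a cgroup's "cgroup.type" file
and a TID into its "cgroup.threads" file. The helper names (cg_write,
cg_make_threaded, cg_move_thread) are made up for this example and are
not part of the patchset; the cgroup directory is supplied by the
caller, and error handling is reduced to a 0 / -1 return.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/*
 * Hypothetical helper: write @val into <cgroup_dir>/<file>.
 * Returns 0 on success, -1 on any error.
 */
static int cg_write(const char *cgroup_dir, const char *file, const char *val)
{
	char path[4096];
	FILE *f;
	int n, ret = 0;

	n = snprintf(path, sizeof(path), "%s/%s", cgroup_dir, file);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	f = fopen(path, "w");
	if (!f)
		return -1;

	if (fputs(val, f) == EOF)
		ret = -1;
	/* Buffered write errors may only surface when the stream is closed. */
	if (fclose(f) == EOF)
		ret = -1;
	return ret;
}

/* Turn a cgroup into a threaded one by writing to "cgroup.type". */
static int cg_make_threaded(const char *cgroup_dir)
{
	return cg_write(cgroup_dir, "cgroup.type", "threaded");
}

/* Place a single thread into a cgroup by writing its TID to "cgroup.threads". */
static int cg_move_thread(const char *cgroup_dir, pid_t tid)
{
	char buf[32];

	snprintf(buf, sizeof(buf), "%ld", (long)tid);
	return cg_write(cgroup_dir, "cgroup.threads", buf);
}
```

Assuming cgroup2 is mounted and a cgroup directory such as
/sys/fs/cgroup/app/workers exists (a made-up path), a caller would
first invoke cg_make_threaded() on that directory and then
cg_move_thread() with the TID of each thread to be placed there.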

This allows threaded controllers to implement thread granular resource
control without getting in the way of system level resource
partitioning.
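
The "nearest domain ancestor" rule from the outline can be pictured
with a toy model. The sketch below is only an illustration: struct
toy_cgroup and toy_resource_domain() are invented names, and this is
not the kernel's struct cgroup nor the code in this patchset. It
assumes the root is never marked threaded.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a cgroup node; NOT the kernel's struct cgroup. */
struct toy_cgroup {
	const char *name;
	struct toy_cgroup *parent;	/* NULL for the root */
	bool threaded;			/* "threaded" was written to cgroup.type */
};

/*
 * Return the resource domain of @cg: the cgroup itself if it is a
 * domain cgroup, otherwise the nearest ancestor that is not threaded.
 */
static struct toy_cgroup *toy_resource_domain(struct toy_cgroup *cg)
{
	while (cg && cg->threaded)
		cg = cg->parent;
	return cg;
}
```

In this model, every threaded cgroup below the same domain cgroup
resolves to that one domain, which matches the description of a
threaded subtree sharing a single resource domain.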

For more details on the interface and behavior, please refer to 0005.

This patchset contains the following six patches.

0001-cgroup-reorganize-cgroup.procs-task-write-path.patch
0002-cgroup-add-flags-to-css_task_iter_start-and-implemen.patch
0003-cgroup-introduce-cgroup-dom_cgrp-and-threaded-css_se.patch
0004-cgroup-implement-CSS_TASK_ITER_THREADED.patch
0005-cgroup-implement-cgroup-v2-thread-support.patch
0006-cgroup-update-debug-controller-to-print-out-thread-m.patch

0001-0005 implement cgroup2 thread mode. 0006 enables debug
controller on it.

The patchset is based on the current cgroup/for-4.14 27f26753f8c0
("cgroup: replace css_set walking populated test with testing
cgrp->nr_populated_csets") and also available in the following git
branch.

git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git review-cgroup2-threads-v3

diffstat follows.

 Documentation/cgroup-v2.txt     |  181 +++++++++-
 include/linux/cgroup-defs.h     |   45 ++
 include/linux/cgroup.h          |   15
 kernel/cgroup/cgroup-internal.h |   12
 kernel/cgroup/cgroup-v1.c       |   69 +++
 kernel/cgroup/cgroup.c          |  712 +++++++++++++++++++++++++++++++---------
 kernel/cgroup/cpuset.c          |    6
 kernel/cgroup/debug.c           |   58 ++-
 kernel/cgroup/freezer.c         |    6
 kernel/cgroup/pids.c            |    1
 kernel/events/core.c            |    1
 mm/memcontrol.c                 |    2
 net/core/netclassid_cgroup.c    |    2
 13 files changed, 906 insertions(+), 204 deletions(-)

Thanks.

--
tejun

[L] http://lkml.kernel.org/r/20170610140351.10703-1-tj@xxxxxxxxxx