[PATCH 1/2] cgroup: Fix /proc/cgroups count for v2

From: T.J. Mercier
Date: Tue May 28 2024 - 12:37:44 EST


The /proc/cgroups documentation says that the num_cgroups value is
"the number of control groups in this hierarchy using this controller."

The value printed is simply the total number of cgroups in the hierarchy,
which is correct for v1 but not for the shared v2 hierarchy.

Consider:
  controllers="cpuset cpu io memory hugetlb pids rdma misc"
  for c in $controllers
  do
    echo +$c > /sys/fs/cgroup/cgroup.subtree_control
    mkdir /sys/fs/cgroup/$c
    echo +$c > /sys/fs/cgroup/$c/cgroup.subtree_control
    for i in `seq 100`; do mkdir /sys/fs/cgroup/$c/$i; done
  done
  cat /proc/cgroups

cpuset 0 809 1
cpu 0 809 1
cpuacct 0 809 1
blkio 0 809 1
memory 0 809 1
devices 0 809 1
freezer 0 809 1
net_cls 0 809 1
perf_event 0 809 1
net_prio 0 809 1
hugetlb 0 809 1
pids 0 809 1
rdma 0 809 1
misc 0 809 1
debug 0 809 1

A count of 809 is reported for each controller, but only 109 should be
reported for most of them since each controller is enabled in only part
of the hierarchy. (Note that io depends on memcg, so enabling io also
enables memory, and the memory count should be 209.)
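
The arithmetic behind those numbers can be checked with a short sketch.
The variable names are illustrative only and are not part of the patch;
the sketch assumes the reproducer above ran on an otherwise empty v2
hierarchy.

```shell
#!/bin/sh
# Expected counts for the reproducer above (illustrative names only).
controllers=8      # controllers enabled by the script
children=100       # leaf cgroups created under each controller directory

# Every cgroup in the hierarchy: root + 8 directories + 8 * 100 leaves.
total=$((1 + controllers * (1 + children)))

# Cgroups using one controller: the root, the 8 top-level directories
# (enabled through the root's cgroup.subtree_control), and the 100
# leaves under that controller's own directory.
per_controller=$((1 + controllers + children))

# The 209 case from the note: the io/memcg dependency also pulls in
# the 100 leaves under the io directory.
with_io_dependency=$((per_controller + children))

echo "total=$total per_controller=$per_controller with_io_dependency=$with_io_dependency"
```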

The number of cgroups using a controller is an important metric since
kernel memory is used for each cgroup, and some kernel operations scale
with the number of cgroups for some controllers (memory, io). Users
therefore have an interest in tracking and minimizing that number.
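
For comparison, a rough userspace approximation of the per-controller
count can be obtained by walking the hierarchy and checking each
cgroup's cgroup.controllers file. This is only a sketch: the helper
name count_cgroups_with is hypothetical, and the result is expected to
miss both zombie cgroups and controllers that are enabled only
implicitly as a dependency, so it may not match the kernel's count.

```shell
#!/bin/sh
# count_cgroups_with ROOT CONTROLLER   (hypothetical helper)
# Counts directories under ROOT (ROOT included) whose cgroup.controllers
# file lists CONTROLLER as a whole word.
count_cgroups_with() {
    root=$1
    ctrl=$2
    find "$root" -type d | while IFS= read -r dir; do
        f="$dir/cgroup.controllers"
        [ -r "$f" ] || continue
        # Pad with spaces so that "cpu" does not match "cpuset".
        case " $(tr '\n' ' ' < "$f") " in
            *" $ctrl "*) echo x ;;
        esac
    done | wc -l | tr -d ' '
}
```

On a system with cgroup v2 mounted at /sys/fs/cgroup, it could be
invoked as: count_cgroups_with /sys/fs/cgroup memory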

Signed-off-by: T.J. Mercier <tjmercier@xxxxxxxxxx>

---
Changes from RFC:
Don't manually initialize the atomic counters to 0 since they are
kzalloced - Michal Koutny

Also return the CSS count for utility controllers instead of the cgroup
count - Michal Koutny

include/linux/cgroup-defs.h | 6 ++++++
kernel/cgroup/cgroup-v1.c | 8 ++++++--
kernel/cgroup/cgroup.c | 2 ++
3 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h
index ea48c861cd36..bc1dbf7652c4 100644
--- a/include/linux/cgroup-defs.h
+++ b/include/linux/cgroup-defs.h
@@ -579,6 +579,12 @@ struct cgroup_root {
 	/* Number of cgroups in the hierarchy, used only for /proc/cgroups */
 	atomic_t nr_cgrps;

+	/*
+	 * Number of cgroups using each controller. Includes online and zombies.
+	 * Used only for /proc/cgroups.
+	 */
+	atomic_t nr_css[CGROUP_SUBSYS_COUNT];
+
 	/* Hierarchy-specific flags */
 	unsigned int flags;

diff --git a/kernel/cgroup/cgroup-v1.c b/kernel/cgroup/cgroup-v1.c
index b9dbf6bf2779..9bad59486c46 100644
--- a/kernel/cgroup/cgroup-v1.c
+++ b/kernel/cgroup/cgroup-v1.c
@@ -675,11 +675,15 @@ int proc_cgroupstats_show(struct seq_file *m, void *v)
 	 * cgroup_mutex contention.
 	 */

-	for_each_subsys(ss, i)
+	for_each_subsys(ss, i) {
+		int count = cgroup_on_dfl(&ss->root->cgrp) ?
+			atomic_read(&ss->root->nr_css[i]) : atomic_read(&ss->root->nr_cgrps);
+
 		seq_printf(m, "%s\t%d\t%d\t%d\n",
 			   ss->legacy_name, ss->root->hierarchy_id,
-			   atomic_read(&ss->root->nr_cgrps),
+			   count,
 			   cgroup_ssid_enabled(i));
+	}

 	return 0;
 }
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index e32b6972c478..1bacd7cf7551 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -5362,6 +5362,7 @@ static void css_free_rwork_fn(struct work_struct *work)
 		ss->css_free(css);
 		cgroup_idr_remove(&ss->css_idr, id);
 		cgroup_put(cgrp);
+		atomic_dec(&ss->root->nr_css[ss->id]);

 		if (parent)
 			css_put(parent);
@@ -5504,6 +5505,7 @@ static int online_css(struct cgroup_subsys_state *css)
 		atomic_inc(&css->online_cnt);
 		if (css->parent)
 			atomic_inc(&css->parent->online_cnt);
+		atomic_inc(&ss->root->nr_css[ss->id]);
 	}
 	return ret;
 }

base-commit: 6fbf71854e2ddea7c99397772fbbb3783bfe15b5
--
2.45.1.288.g0e0cd299f1-goog