[PATCH v2] cgroup: Reorganize css_set_lock and kernfs path processing

From: Michal Koutný
Date: Wed Sep 28 2022 - 07:33:24 EST


The commit 74e4b956eb1c incorrectly wrapped kernfs_walk_and_get
(might_sleep) under css_set_lock (spinlock). css_set_lock is needed by
__cset_cgroup_from_root to ensure stable cset->cgrp_links but not for
kernfs_walk_and_get.

We only need to make sure that the returned root_cgrp won't be freed
under us. This is given in the case of global root because it is static
(cgrp_dfl_root.cgrp). When the root_cgrp is lower in the hierarchy, it
is pinned by cgroup_ns->root_cset (and `current` task cannot switch
namespace asynchronously so ns_proxy pins cgroup_ns).

(Note this reasoning won't hold for root cgroups in v1 hierarchies but
the path resolution works only with the default hierarchy.)

Fixes: 74e4b956eb1c: ("cgroup: Honor caller's cgroup NS when resolving path")
Reported-by: Dan Carpenter <dan.carpenter@xxxxxxxxxx>
Signed-off-by: Michal Koutný <mkoutny@xxxxxxxx>
---
kernel/cgroup/cgroup.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

Hello.

v2: dropped changes around kernfs_path_from_node(), reworded commit
message

I realized the pinning with reference taking won't really work
generally. The code would get the reference within RCU read section, so
it'd have to be cgroup_get_live() and if that fails there's not much to
do.

So, instead of generalization, I only post special-cased patch that
fixes the introduced bug and doesn't touch the rest.

diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index c37b8265c0a3..ac71af8ef65c 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -1392,11 +1392,16 @@ static void cgroup_destroy_root(struct cgroup_root *root)
cgroup_free_root(root);
}

+/*
+ * Returned cgroup is without refcount but it's valid as long as cset pins it.
+ */
static inline struct cgroup *__cset_cgroup_from_root(struct css_set *cset,
struct cgroup_root *root)
{
struct cgroup *res_cgroup = NULL;

+ lockdep_assert_held(&css_set_lock);
+
if (cset == &init_css_set) {
res_cgroup = &root->cgrp;
} else if (root == &cgrp_dfl_root) {
@@ -6673,8 +6678,8 @@ struct cgroup *cgroup_get_from_path(const char *path)

spin_lock_irq(&css_set_lock);
root_cgrp = current_cgns_cgroup_from_root(&cgrp_dfl_root);
- kn = kernfs_walk_and_get(root_cgrp->kn, path);
spin_unlock_irq(&css_set_lock);
+ kn = kernfs_walk_and_get(root_cgrp->kn, path);
if (!kn)
goto out;

--
2.37.3