Re: [PATCH 1/2] cgroup: Fix incorrect warning from cgroup_apply_control_disable()

From: Waiman Long
Date: Mon Sep 13 2021 - 14:35:11 EST


On 9/13/21 2:05 PM, Tejun Heo wrote:
> Hello,
>
> On Thu, Sep 09, 2021 at 10:42:55PM -0400, Waiman Long wrote:
> > diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
> > index 881ce1470beb..e31bca9fcd46 100644
> > --- a/kernel/cgroup/cgroup.c
> > +++ b/kernel/cgroup/cgroup.c
> > @@ -3140,7 +3140,16 @@ static void cgroup_apply_control_disable(struct cgroup *cgrp)
> >  			if (!css)
> >  				continue;
> > -			WARN_ON_ONCE(percpu_ref_is_dying(&css->refcnt));
> > +			/*
> > +			 * A kill_css() might have been called previously, but
> > +			 * the css may still linger for a while before being
> > +			 * removed. Skip it in this case.
> > +			 */
> > +			if (percpu_ref_is_dying(&css->refcnt)) {
> > +				WARN_ON_ONCE(css->parent &&
> > +					cgroup_ss_mask(dsct) & (1 << ss->id));
> > +				continue;
> > +			}
>
> This warning did help me catch some gnarly bugs. Any chance we can keep it
> for normal cases and elide it just for remounting?

The problem with percpu_ref_is_dying() is that it becomes true after percpu_ref_exit() is called in css_free_rwork_fn(), which runs after an RCU delay. If you want to catch the fact that kill_css() has been called, we can check the CSS_DYING flag, which is set in kill_css() by commit 33c35aa481786 ("cgroup: Prevent kill_css() from being called more than once"). Would that be an acceptable alternative?
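
To illustrate the distinction, here is a rough userspace sketch, not kernel code and not the proposed patch. The mock_ names, the struct layout and the flag value are all assumptions made for the example; ref_looks_dying merely stands in for what percpu_ref_is_dying(&css->refcnt) would report.

```c
#include <stdbool.h>

/* Illustrative value only; stands in for the kernel's CSS_DYING flag. */
#define MOCK_CSS_DYING (1u << 4)

/* Hypothetical stand-in for struct cgroup_subsys_state. */
struct mock_css {
	unsigned int flags;	/* set explicitly by mock_kill_css() */
	bool ref_looks_dying;	/* models percpu_ref_is_dying(&css->refcnt) */
};

/* Models kill_css(): sets the dying flag once and kills the ref. */
static void mock_kill_css(struct mock_css *css)
{
	if (css->flags & MOCK_CSS_DYING)
		return;		/* already killed, nothing to do */
	css->flags |= MOCK_CSS_DYING;
	css->ref_looks_dying = true;
}

/*
 * Models the free path described above: after percpu_ref_exit() the ref
 * reads as dying, whether or not kill_css() was ever called.
 */
static void mock_free_path(struct mock_css *css)
{
	css->ref_looks_dying = true;
}

/* Flag-based check: true only if mock_kill_css() actually ran. */
static bool mock_css_killed(const struct mock_css *css)
{
	return (css->flags & MOCK_CSS_DYING) != 0;
}
```

With this model, a css that only went through mock_free_path() has ref_looks_dying set while mock_css_killed() stays false, which is the case a flag-based check would tell apart from a real kill_css() call.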

Cheers,
Longman