Re: [PATCH] psi: Inherit parent cgroup psi enable state

From: Chuyi Zhou
Date: Mon Jul 29 2024 - 02:31:26 EST


Hello,

在 2024/7/29 12:45, K Prateek Nayak 写道:
Hello Chuyi,

On 7/29/2024 9:11 AM, Chuyi Zhou wrote:
Currently when a parent cgroup disables psi through cgroup.pressure, newly
created child cgroups do not inherit the psi state of the parent cgroup.

This patch tries to solve this issue. When a child cgroup is created, it
would inherit the psi enabled state of the parent in group_init().
Once the enable state is found to be false in the css_populate_dir(), the
{cpu, io, memory}.pressure files will be hidden using cgroup_file_show().

Signed-off-by: Chuyi Zhou <zhouchuyi@xxxxxxxxxxxxx>
---
  kernel/cgroup/cgroup.c | 21 +++++++++++++++++++--
  kernel/sched/psi.c     |  4 ++--
  2 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index c8e4b62b436a4..775fe528efcad 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -1719,6 +1719,24 @@ static void css_clear_dir(struct cgroup_subsys_state *css)
      }
  }
+static int populate_psi_files(struct cgroup_subsys_state *css)
+{
+    struct cgroup *cgrp = css->cgroup;
+    int ret, i;
+
+    ret = cgroup_addrm_files(css, cgrp, cgroup_psi_files, true);
+    if (ret < 0)
+        return ret;
+
+    if (cgrp->psi && !cgrp->psi->enabled) {
+        for (i = 0; i < NR_PSI_RESOURCES; i++)
+            cgroup_file_show(&cgrp->psi_files[i], 0);
+    }
+
+    return ret;
+}
+
+
  /**
   * css_populate_dir - create subsys files in a cgroup directory
   * @css: target css
@@ -1742,8 +1760,7 @@ static int css_populate_dir(struct cgroup_subsys_state *css)
                  return ret;
              if (cgroup_psi_enabled()) {
-                ret = cgroup_addrm_files(css, cgrp,
-                             cgroup_psi_files, true);
+                ret = populate_psi_files(css);
                  if (ret < 0) {
                      cgroup_addrm_files(css, cgrp,
                                 cgroup_base_files, false);
diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index 020d58967d4e8..d0aa17b368819 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -180,7 +180,7 @@ static void group_init(struct psi_group *group)
  {
      int cpu;
-    group->enabled = true;
+    group->enabled = group->parent ? group->parent->enabled : true;

Since this is only the init path, if the user later enables PSI
accounting for a parent, should it not re-evaluate it for the groups
down the hierarchy?

Looking at "cgroup_pressure_write()", I could not spot it calling
"css_populate_dir()". Should it not walk the hierarchy and do a
"cgroup_file_show()" considering the changes in you patch?

(P.S. I'm not too familiar with this piece of code so please do let me
 know if I missed something obvious)

Perhaps my description in the commit log was not clear enough. This patch is intended to make child cgroups inherit the state of the parent node *during initialization*. The cgroup.pressure interface remains the same as before, only changing the enable state at the current level.

In production environments, the overhead of PSI could be significant on some machines (with many deep levels of cgroups). For certain tasks (such as /sys/fs/cgroup/offline/pod_xxx), we may not need to monitor their PSI metrics. With this patch, after disabling /sys/fs/cgroup/offline/cgroup.pressure, any subsequently deployed offline pods will not enable PSI. Although users can disable PSI by traversing all cgroups under /sys/fs/cgroup/offline/, this may not be convenient enough.

Thanks.