Re: [PATCH v2 2/3] sched/fair: Ignore isolated cpus in update_numa_stat

From: Chuyi Zhou
Date: Wed Jan 08 2025 - 05:48:05 EST


Hello Waiman,

On 2025/1/8 02:39, Waiman Long wrote:

On 1/3/25 1:59 AM, Chuyi Zhou wrote:
Currently, update_numa_stats() iterates over each CPU in a node to gather
load information for the node and tries to find an idle CPU as the candidate
best_cpu within the node.

In update_numa_stats() we should take into account the scheduling domain.
This is because the "isolcpus" kernel command line option and cpuset
isolated partitions can remove CPUs from load balancing. Similar to task wakeup
and periodic load balancing, we should not involve isolated CPUs in NUMA
balancing. When gathering load information for nodes, we need to ignore the
load of isolated CPUs. This change also avoids selecting an isolated CPU
as the idle_cpu.

Signed-off-by: Chuyi Zhou <zhouchuyi@xxxxxxxxxxxxx>
---
  kernel/sched/fair.c | 7 +++++--
  1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index f544012b9320..a0139659fe7a 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2125,6 +2125,11 @@ static void update_numa_stats(struct task_numa_env *env,
      for_each_cpu(cpu, cpumask_of_node(nid)) {
          struct rq *rq = cpu_rq(cpu);
+        /* skip isolated cpus' load */
+        if (!rcu_dereference(rq->sd))
+            continue;
+
+        ns->weight++;
          ns->load += cpu_load(rq);
          ns->runnable += cpu_runnable(rq);
          ns->util += cpu_util_cfs(cpu);
@@ -2144,8 +2149,6 @@ static void update_numa_stats(struct task_numa_env *env,
      }
      rcu_read_unlock();
-    ns->weight = cpumask_weight(cpumask_of_node(nid));
-
      ns->node_type = numa_classify(env->imbalance_pct, ns);
      if (idle_core >= 0)

You should initialize ns->weight to 0 before the iteration to prevent a pre-existing ns->weight value from corrupting the result.

Cheers,
Longman


Thanks for your review.

We have already memset ns to 0 before the start of update_numa_stats(), so I think it should be okay here.
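
To illustrate the point, here is a minimal user-space sketch (not the kernel code) of the pattern: the stats structure is zeroed first, so incrementing weight inside the loop starts from 0 and only counts CPUs that still have a scheduling domain. The names numa_stats_sketch, cpu_sketch, has_sd and update_stats_sketch are hypothetical stand-ins for struct numa_stats, struct rq, rq->sd and update_numa_stats(); has_sd == false models an isolated CPU whose rq->sd is NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the kernel's struct numa_stats. */
struct numa_stats_sketch {
	unsigned long load;
	unsigned int weight;
};

/* Hypothetical per-CPU record; has_sd == false models rq->sd == NULL. */
struct cpu_sketch {
	unsigned long load;
	bool has_sd;
};

static void update_stats_sketch(struct numa_stats_sketch *ns,
				const struct cpu_sketch *cpus, int nr_cpus)
{
	assert(ns && cpus);

	/* Zero everything up front, so any stale weight is discarded. */
	memset(ns, 0, sizeof(*ns));

	for (int i = 0; i < nr_cpus; i++) {
		/* Skip isolated CPUs: neither their load nor their count is used. */
		if (!cpus[i].has_sd)
			continue;

		ns->weight++;
		ns->load += cpus[i].load;
	}
}
```

With three CPUs where the middle one is isolated, weight ends up as 2 and load is the sum of the two non-isolated CPUs, regardless of what ns contained before the call.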