Hello Chuyi,
On 12/23/2024 6:28 PM, Chuyi Zhou wrote:
On 2024/12/18 14:21, K Prateek Nayak wrote:
Hello Chuyi,
On 12/16/2024 5:53 PM, Chuyi Zhou wrote:
[..snip..]
@@ -2081,6 +2081,12 @@ numa_type numa_classify(unsigned int imbalance_pct,
return node_fully_busy;
}
+static inline bool numa_migrate_test_cpu(struct task_struct *p, int cpu)
+{
+ return cpumask_test_cpu(cpu, p->cpus_ptr) &&
+ housekeeping_cpu(cpu, HK_TYPE_DOMAIN);
+}
+
#ifdef CONFIG_SCHED_SMT
/* Forward declarations of select_idle_sibling helpers */
static inline bool test_idle_cores(int cpu);
@@ -2168,7 +2174,7 @@ static void task_numa_assign(struct task_numa_env *env,
/* Find alternative idle CPU. */
for_each_cpu_wrap(cpu, cpumask_of_node(env->dst_nid), start + 1) {
Can we just do:
for_each_cpu_and(cpu, cpumask_of_node(env->dst_nid), housekeeping_cpumask(HK_TYPE_DOMAIN)) {
...
}
and avoid adding numa_migrate_test_cpu(). Thoughts?
Makes sense, but there doesn't currently seem to be an API like for_each_cpu_wrap_and().
Do you think the following is better?
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 855df103f4dd..4792ef672738 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2167,9 +2167,9 @@ static void task_numa_assign(struct task_numa_env *env,
int start = env->dst_cpu;
/* Find alternative idle CPU. */
- for_each_cpu_wrap(cpu, cpumask_of_node(env->dst_nid), start + 1) {
+ for_each_cpu_and(cpu, cpumask_of_node(env->dst_nid), housekeeping_cpumask(HK_TYPE_DOMAIN)) {
if (cpu == env->best_cpu || !idle_cpu(cpu) ||
"start" is set to "env->dst_cpu", which is already taken care of here by
the first comparison.
- !cpumask_test_cpu(cpu, env->p->cpus_ptr)) {
+ cpu == start || !cpumask_test_cpu(cpu, env->p->cpus_ptr)) {
continue;
}
I think for_each_cpu_wrap() was used to reduce contention for the xchg
operation below. Perhaps we can have a per-cpu temporary mask (like
load_balance_mask) if we want to reduce the xchg contention and break
this into cpumask_and() + for_each_cpu_wrap() steps. I'm not sure if
any of the existing masks (load_balance_mask, select_rq_mask,
should_we_balance_tmpmask) can be safely reused. Otherwise, perhaps we
can make a case for for_each_cpu_and_wrap() with this use case.
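
To illustrate the cpumask_and() + for_each_cpu_wrap() split, here is a rough userspace model, not kernel code. It assumes a single 64-bit word stands in for a cpumask, and every model_* name below is hypothetical. It only shows the visiting order we would get by intersecting the node mask with the housekeeping mask first and then scanning with wrap-around from dst_cpu + 1:

```c
#include <assert.h>
#include <stdint.h>

/* Userspace model only: one bit per CPU, NOT the kernel cpumask API. */
#define MODEL_NR_CPUS 64

typedef uint64_t model_cpumask_t;

/* Stand-in for cpumask_and(): CPUs set in both masks. */
static model_cpumask_t model_cpumask_and(model_cpumask_t a, model_cpumask_t b)
{
	return a & b;
}

/*
 * Stand-in for for_each_cpu_wrap(): visit every set bit exactly once,
 * beginning at 'start' and wrapping around to bit 0. The visited CPUs
 * are written to 'out' in order; the return value is how many were found.
 */
static int model_wrap_order(model_cpumask_t mask, int start, int *out)
{
	int n = 0;

	assert(start >= 0 && start < MODEL_NR_CPUS);

	for (int i = 0; i < MODEL_NR_CPUS; i++) {
		int cpu = (start + i) % MODEL_NR_CPUS;

		if (mask & ((model_cpumask_t)1 << cpu))
			out[n++] = cpu;
	}

	return n;
}

/*
 * Two-step scan: intersect into a temporary mask first, then walk it
 * with wrap-around starting just after dst_cpu.
 */
static int model_find_candidates(model_cpumask_t node_mask,
				 model_cpumask_t hk_mask,
				 int dst_cpu, int *out)
{
	model_cpumask_t tmp = model_cpumask_and(node_mask, hk_mask);

	return model_wrap_order(tmp, (dst_cpu + 1) % MODEL_NR_CPUS, out);
}
```

With CPUs 0-7 on the node, CPUs 2 and 5 excluded from the housekeeping mask, and dst_cpu = 3, the model visits 4, 6, 7, 0, 1, 3. In the kernel the temporary mask would need to be a per-cpu scratch mask that is safe to use from this context, which is the open question above.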