Re: [External] Re: [PATCH 2/3] sched/fair: Ignore isolated cpus in update_numa_stat
From: Chuyi Zhou
Date: Wed Dec 18 2024 - 02:20:03 EST
Hello Prateek,
On 2024/12/18 14:26, K Prateek Nayak wrote:
Hello Chuyi,
On 12/16/2024 5:53 PM, Chuyi Zhou wrote:
[..snip..]
@@ -2125,6 +2125,11 @@ static void update_numa_stats(struct task_numa_env *env,
for_each_cpu(cpu, cpumask_of_node(nid)) {
Looking at sched_init_domains(), we build sched domains only for
active CPUs in housekeeping_cpumask(HK_TYPE_DOMAIN), so, similar to the
question on Patch 3, can we get away with just modifying this outer loop
to:
	for_each_cpu_and(cpu, cpumask_of_node(nid),
			 housekeeping_cpumask(HK_TYPE_DOMAIN)) {
		...
	}
Thoughts?
We now have two ways of using isolated CPUs.
One is the isolcpus= kernel command line parameter. 'isolcpus=0-7'
excludes CPUs 0-7 from housekeeping_cpumask without placing any further
restriction on tasks' cpumasks. A typical problem case is a task whose
cpumask covers all the CPUs in the system: it can still be migrated to,
or woken up on, CPUs 0-7. Commit 23d04d8 resolves a similar issue in
task wakeup.
The other is the isolated cpuset partition:
mkdir blue
echo 5-8 > blue/cpuset.cpus
echo "isolated" > blue/cpuset.cpus.partition
CPUs 5-8 now also form an isolated domain, but they are not excluded
from housekeeping_cpumask. The cpumask of tasks in the blue cgroup is
restricted to 5-8.
I think that in update_numa_stats(), using only
housekeeping_cpumask(HK_TYPE_DOMAIN) is not enough, because it cannot
skip the CPUs in those isolated cpuset partitions.
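To illustrate the difference between the two cases, here is a minimal
userspace sketch. It is not kernel code: cpumasks are modelled as plain
64-bit words, and every name in it (cpumask_model_t, isolation_model,
model_housekeeping(), scan_housekeeping_only(), scan_skip_all_isolated())
is hypothetical. It also assumes the behaviour described above, i.e. that
only isolcpus= removes CPUs from the housekeeping mask.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace model only: one bit per CPU, at most 64 CPUs. */
typedef uint64_t cpumask_model_t;

/* Build a mask covering CPUs first..last (inclusive). */
static cpumask_model_t cpu_range(unsigned int first, unsigned int last)
{
	cpumask_model_t mask = 0;
	unsigned int cpu;

	assert(first <= last && last < 64);
	for (cpu = first; cpu <= last; cpu++)
		mask |= (cpumask_model_t)1 << cpu;
	return mask;
}

/* Hypothetical description of the isolation state of a system. */
struct isolation_model {
	cpumask_model_t possible;        /* all CPUs in the system */
	cpumask_model_t boot_isolated;   /* CPUs listed in isolcpus= */
	cpumask_model_t cpuset_isolated; /* CPUs in isolated cpuset partitions */
};

/*
 * Stand-in for housekeeping_cpumask(HK_TYPE_DOMAIN), under the assumption
 * stated in this mail: only isolcpus= CPUs are removed from it.
 */
static cpumask_model_t model_housekeeping(const struct isolation_model *s)
{
	return s->possible & ~s->boot_isolated;
}

/* CPUs visited by a for_each_cpu_and(cpu, node, housekeeping) style loop. */
static cpumask_model_t scan_housekeeping_only(const struct isolation_model *s,
					      cpumask_model_t node)
{
	return node & model_housekeeping(s);
}

/* CPUs visited when both kinds of isolated CPUs are skipped. */
static cpumask_model_t scan_skip_all_isolated(const struct isolation_model *s,
					      cpumask_model_t node)
{
	return node & ~(s->boot_isolated | s->cpuset_isolated);
}
```

In this model, with a 16-CPU node and isolcpus=0-7, the housekeeping-only
scan visits CPUs 8-15 (0xFF00), as intended. With the blue partition on
CPUs 5-8 and no isolcpus=, the same scan still visits all 16 CPUs
(0xFFFF), including 5-8, while the scan that also skips the cpuset
partition visits 0xFE1F.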
Thanks.