Re: [PATCH] dl_server: Reset DL server params when rd changes
From: Waiman Long
Date: Sat Nov 09 2024 - 13:18:32 EST
On 11/8/24 10:30 PM, Waiman Long wrote:
I have the patchset to enforce that rebuild_sched_domains_locked()
will only be called at most once per cpuset operation.
By adding some debug code to further study the null total_bw issue
when cpuset.cpus.partition is being changed, I found that eliminating
the redundant rebuild_sched_domains_locked() reduced the chance of
hitting null total_bw, but it did not eliminate it. By running my cpuset
test script, I hit 250 cases of null total_bw with the v6.12-rc6
kernel. With my new cpuset patch applied, that drops to 120 cases
of null total_bw.
I will try to look further for the exact condition that triggers null
total_bw generation.
After further testing, the 120 cases of null total_bw can be classified
into the following 3 categories.
1) 51 cases when an isolated partition with isolated CPUs is created.
An isolated CPU is not subject to scheduling, so a total_bw of 0 is
fine and not really a problem.
2) 67 cases when nested partitions are being removed (A1 - A2). This
is probably caused by some kind of race condition. If I insert an
artificial delay between the removal of A2 and A1, total_bw is fine. If
there is no delay, I can see a null total_bw. That shouldn't really be a
problem in practice, though we may still need to figure out why.
3) Two cases where null total_bw is seen when a new partition is created
by moving all the CPUs in the parent cgroup into its partition and the
parent becomes a null partition with no CPU. The following example
illustrates the steps.
#!/bin/bash
CGRP=/sys/fs/cgroup
cd $CGRP
echo +cpuset > cgroup.subtree_control
mkdir A1
cd A1
echo 0-1 > cpuset.cpus
echo root > cpuset.cpus.partition
echo "A1 partition"
echo +cpuset > cgroup.subtree_control
mkdir A2
cd A2
echo 0-1 > cpuset.cpus
echo root > cpuset.cpus.partition
echo "A2 partition"
cd ..
echo "Remove A2"
rmdir A2
cd ..
echo "Remove A1"
rmdir A1
In this corner case, there is actually no change in the set of sched
domains. The sched domain set of CPUs 0-1 is moved from partition A1
to A2 when A2 is created, and back to A1 when A2 is removed. This is
similar to calling rebuild_sched_domains_locked() twice with the same
input. I believe that is the condition that causes null total_bw.
Now the question is why the deadline code behaves this way. It is
probably a bug that needs to be addressed.
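For category 2 above, the delay experiment can be sketched roughly as
below. This is only an illustration, not the actual test script: the
helper name remove_nested and its arguments are made up here, and it
assumes the A1/A2 hierarchy from the example above already exists under
the given root directory.

```shell
#!/bin/bash
# Hypothetical helper (not part of the cpuset test script).
# remove_nested ROOT [DELAY]
#   Removes the nested partition ROOT/A1/A2 first, optionally waits
#   DELAY seconds, then removes the parent partition ROOT/A1.
remove_nested() {
    local root=$1
    local delay=${2:-0}    # default: no delay between the two removals

    rmdir "$root/A1/A2" || return 1
    sleep "$delay"         # artificial delay between removing A2 and A1
    rmdir "$root/A1"
}
```

Calling it as "remove_nested /sys/fs/cgroup 1" corresponds to the case
with a delay, and "remove_nested /sys/fs/cgroup" to the back-to-back
removal where a null total_bw can show up.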
Cheers,
Longman