On Tue, Oct 08, 2024 at 10:02:23AM -0400, Waiman Long wrote:
On 10/7/24 11:28 AM, Kuan-Wei Chiu wrote:Thanks for your feedback. I appreciate your point regarding the implicit
Improve the efficiency of calculating the total number of schedulingYou are taking the implicit assumption that a boolean true is casted to int
domains by using the updated uf_union function, which now returns a
boolean to indicate if a merge occurred. Previously, an additional loop
was needed to count root nodes for distinct groups. With this change,
each successful merge reduces the domain count (ndoms) directly,
eliminating the need for the final loop and enhancing performance.
Signed-off-by: Kuan-Wei Chiu <visitorckw@xxxxxxxxx>
---
Note: Tested with test_cpuset_prs.sh
Side note: I know this optimization provides limited efficiency
improvements in this case, but since the union-find code is in the
library and other users may need group counting in the future, and
the required code change is minimal, I think it's still worthwhile.
kernel/cgroup/cpuset.c | 10 +++-------
1 file changed, 3 insertions(+), 7 deletions(-)
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index a4dd285cdf39..5e9301550d43 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -817,6 +817,8 @@ static int generate_sched_domains(cpumask_var_t **domains,
if (root_load_balance && (csn == 1))
goto single_root_domain;
+ ndoms = csn;
+
for (i = 0; i < csn; i++)
uf_node_init(&csa[i]->node);
@@ -829,17 +831,11 @@ static int generate_sched_domains(cpumask_var_t **domains,
* partition root cpusets.
*/
WARN_ON_ONCE(cgrpv2);
- uf_union(&csa[i]->node, &csa[j]->node);
+ ndoms -= uf_union(&csa[i]->node, &csa[j]->node);
1. That is the usual practice, but it is not part of the C standard itself
though it is for C++. I will be more comfortable with the "if (cond)
ndoms++" form. It will also be more clear.
casting of boolean values. And changing the code to:
if (uf_union(&csa[i]->node, &csa[j]->node))
--ndoms;
would also enhance clarity and readability.
Would you like me to resend v3? I'm asking because I suspect Tejun may
want to see more user cases before considering such optimizations.