+	/*
+	 * Examine topology from all cpu's point of views to detect the lowest
+	 * sched_domain_topology_level where a highest capacity cpu is visible
+	 * to everyone.
+	 */
+	for_each_cpu(i, cpu_map) {
+		unsigned long max_capacity = arch_scale_cpu_capacity(NULL, i);
+		int tl_id = 0;
+
+		for_each_sd_topology(tl) {
+			if (tl_id < asym_level)
+				goto next_level;
+

I think if you increment and then continue here you might save the extra
branch. I didn't look at any disassembly though to verify the generated
code.

I wonder if we can introduce for_each_sd_topology_from(tl, starting_level)
so that you can start searching from a provided level - which will make this
skipping logic unnecessary? So the code will look like

	for_each_sd_topology_from(tl, asym_level) {
		...
	}

Both options would work. Increment+continue instead of goto would be
slightly less readable I think, since we would still have the increment
at the end of the loop, but it is easy to do. Introducing
for_each_sd_topology_from() would improve things too, but I wonder if it is
worth it.
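
To make the two alternatives concrete, here is a minimal userspace sketch, not kernel code. struct topo_level, for_each_topo(), for_each_topo_from() and both visit_*() functions are hypothetical stand-ins. The only thing borrowed from the kernel is the assumption that the topology table ends with an entry whose mask is NULL.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for struct sched_domain_topology_level, reduced to
 * the one field the iteration needs. In this sketch a string plays the role
 * of the mask; a NULL mask terminates the table.
 */
struct topo_level {
	const char *mask;
};

/* Same shape as for_each_sd_topology(): walk until the NULL terminator. */
#define for_each_topo(tl, table) \
	for ((tl) = (table); (tl)->mask; (tl)++)

/* Hypothetical for_each_sd_topology_from(): start at a given level. */
#define for_each_topo_from(tl, start) \
	for ((tl) = (start); (tl)->mask; (tl)++)

/* Variant 1: walk every level and skip the low ones with increment+continue. */
static int visit_with_continue(struct topo_level *table, int asym_level)
{
	struct topo_level *tl;
	int tl_id = 0, visited = 0;

	for_each_topo(tl, table) {
		if (tl_id < asym_level) {
			tl_id++;	/* the increment is now needed here ... */
			continue;
		}
		visited++;		/* real code would inspect tl here */
		tl_id++;		/* ... and still at the end of the loop */
	}
	return visited;
}

/* Variant 2: start directly at the wanted level, no skipping logic. */
static int visit_from(struct topo_level *table, int asym_level)
{
	struct topo_level *tl;
	int visited = 0;

	/* Caller must not pass an index beyond the terminating entry. */
	assert(asym_level >= 0);

	for_each_topo_from(tl, &table[asym_level])
		visited++;		/* real code would inspect tl here */
	return visited;
}
```

With a three-level table followed by a terminator, both variants visit the same levels for a given asym_level. The second variant drops the tl_id bookkeeping, which is the readability gain being weighed above.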
@@ -1647,18 +1707,27 @@ build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *att
 	struct s_data d;
 	struct rq *rq = NULL;
 	int i, ret = -ENOMEM;
+	struct sched_domain_topology_level *tl_asym;

 	alloc_state = __visit_domain_allocation_hell(&d, cpu_map);
 	if (alloc_state != sa_rootdomain)
 		goto error;

+	tl_asym = asym_cpu_capacity_level(cpu_map);
+

Or maybe this is not a hot path and we don't care that much about optimizing
the search since you call it unconditionally here even for systems that
don't care?

It does increase the cost of things like hotplug and repartitioning of
root_domains slightly, but I don't see how we can avoid it if we want
generic code to set this flag. If the costs are not acceptable I think the
only option is to make the detection architecture specific.

In any case, AFAIK rebuilding the sched_domain hierarchy shouldn't be a
normal and common thing to do. If checking for the flag is not
acceptable on SMP-only architectures, I can move it under arch/arm[,64]
although it is not as clean.
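
For illustration only, one way the cost on symmetric systems could be bounded is a single linear pass over the CPU capacities before any topology walk. This is an assumption of the sketch, not something proposed in this thread. has_asym_capacity() is a hypothetical name, and the cap array stands in for per-cpu arch_scale_cpu_capacity() results.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: report whether any CPU differs in capacity from the
 * first one. `cap` stands in for per-cpu arch_scale_cpu_capacity() values.
 */
static bool has_asym_capacity(const unsigned long *cap, size_t nr_cpus)
{
	size_t i;

	assert(cap != NULL || nr_cpus == 0);

	for (i = 1; i < nr_cpus; i++) {
		if (cap[i] != cap[0])
			return true;	/* first mismatch is enough */
	}
	return false;			/* symmetric: skip the topology walk */
}
```

A caller would run the per-level search only when this returns true, so SMP-only systems would pay for one scan of the capacities per rebuild.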