On Wed, 26 Jun 2019, subhra mazumdar wrote:

> Introduce a per-cpu variable to keep the number of HT siblings of a cpu.
> This will be used for quick lookup in select_idle_cpu to determine the
> limits of search.

Why? The number of siblings is constant at least today unless you play
silly cpu hotplug games. A bit more justification for adding yet another
random storage would be appreciated.

Using cpumask_weight every time in select_idle_cpu to compute the no. of

> This patch does it only for x86.

# grep 'This patch' Documentation/process/submitting-patches.rst

IOW, we all know already that this is a patch and from the subject prefix
and the diffstat it's pretty obvious that this is x86 only.

So instead of documenting the obvious, please add proper context to justify
the change.

Ok. The extra per-CPU optimization was done only for x86 as we cared about

> +/* representing number of HT siblings of each CPU */
> +DEFINE_PER_CPU_READ_MOSTLY(unsigned int, cpumask_weight_sibling);
> +EXPORT_PER_CPU_SYMBOL(cpumask_weight_sibling);

Why does this need an export? No module has any reason to access this.

I will remove it

>  /* representing HT and core siblings of each logical CPU */
>  DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_core_map);
>  EXPORT_PER_CPU_SYMBOL(cpu_core_map);
> @@ -520,6 +524,8 @@ void set_cpu_sibling_map(int cpu)
>  	if (!has_mp) {
>  		cpumask_set_cpu(cpu, topology_sibling_cpumask(cpu));
> +		per_cpu(cpumask_weight_sibling, cpu) =
> +			cpumask_weight(topology_sibling_cpumask(cpu));
>  		cpumask_set_cpu(cpu, cpu_llc_shared_mask(cpu));
>  		cpumask_set_cpu(cpu, topology_core_cpumask(cpu));
>  		c->booted_cores = 1;
> @@ -529,8 +535,12 @@ void set_cpu_sibling_map(int cpu)
>  	for_each_cpu(i, cpu_sibling_setup_mask) {
>  		o = &cpu_data(i);
> -		if ((i == cpu) || (has_smt && match_smt(c, o)))
> +		if ((i == cpu) || (has_smt && match_smt(c, o))) {
>  			link_mask(topology_sibling_cpumask, cpu, i);
> +			threads = cpumask_weight(topology_sibling_cpumask(cpu));
> +			per_cpu(cpumask_weight_sibling, cpu) = threads;
> +			per_cpu(cpumask_weight_sibling, i) = threads;

This only works for SMT=2, but fails to update the rest for SMT=4.

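One way to keep every sibling's cached count in sync could be a second pass
over the finished sibling mask. The following is only a user-space sketch of
that idea, not kernel code and not a proposed fix: sibling_mask,
weight_sibling, setup_mask, link_siblings(), bring_up_cpu(), core_of[] and
demo_smt4() are hypothetical stand-ins invented for illustration, and "same
core id" is a simplifying assumption in place of match_smt().

```c
#define NR_CPUS 8

/* Hypothetical stand-ins for topology_sibling_cpumask() and the per-CPU counter. */
static unsigned int sibling_mask[NR_CPUS];
static unsigned int weight_sibling[NR_CPUS];
static unsigned int setup_mask;	/* CPUs brought up so far */

/* Portable popcount, playing the role of cpumask_weight(). */
static unsigned int mask_weight(unsigned int mask)
{
	unsigned int n = 0;

	for (; mask; mask &= mask - 1)
		n++;
	return n;
}

/* Set each CPU in the other's sibling mask, similar in spirit to link_mask(). */
static void link_siblings(int a, int b)
{
	sibling_mask[a] |= 1u << b;
	sibling_mask[b] |= 1u << a;
}

/*
 * Bring up one CPU. core_of[] maps a CPU number to its core; CPUs on the
 * same core are treated as SMT siblings (an assumption made for this model).
 */
static void bring_up_cpu(int cpu, const int *core_of)
{
	int i;

	setup_mask |= 1u << cpu;

	/* Pass 1: build the sibling masks. */
	for (i = 0; i < NR_CPUS; i++) {
		if (!(setup_mask & (1u << i)))
			continue;
		if (i == cpu || core_of[i] == core_of[cpu])
			link_siblings(cpu, i);
	}

	/* Pass 2: refresh the cached weight of every sibling in the mask. */
	for (i = 0; i < NR_CPUS; i++) {
		if (sibling_mask[cpu] & (1u << i))
			weight_sibling[i] = mask_weight(sibling_mask[i]);
	}
}

/* Bring up the four threads of one core; return 1 if all counters read 4. */
static int demo_smt4(void)
{
	static const int core_of[NR_CPUS] = { 0, 0, 0, 0, 1, 1, 1, 1 };
	int cpu;

	for (cpu = 0; cpu < 4; cpu++)
		bring_up_cpu(cpu, core_of);

	for (cpu = 0; cpu < 4; cpu++)
		if (weight_sibling[cpu] != 4)
			return 0;
	return 1;
}
```

Doing the refresh after the linking loop means each counter is computed from
a completed mask, so the result does not depend on how many threads a core
has or in which order they come up.
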
> @@ -1482,6 +1494,8 @@ static void remove_siblinginfo(int cpu)
>  	for_each_cpu(sibling, topology_core_cpumask(cpu)) {
>  		cpumask_clear_cpu(cpu, topology_core_cpumask(sibling));
> +		per_cpu(cpumask_weight_sibling, sibling) =
> +			cpumask_weight(topology_sibling_cpumask(sibling));

While remove does the right thing.

Thanks,
tglx