On Tue, Jan 17, 2012 at 9:55 PM, KOSAKI Motohiro
<kosaki.motohiro@xxxxxxxxx> wrote:
(1/17/12 9:07 PM), Venkatesh Pallipadi wrote:
Kernel's notion of possible cpus (from include/linux/cpumask.h)
* cpu_possible_mask- has bit 'cpu' set iff cpu is populatable
* The cpu_possible_mask is fixed at boot time, as the set of CPU id's
* that it is possible might ever be plugged in at anytime during the
* life of that system boot.
#define num_possible_cpus() cpumask_weight(cpu_possible_mask)
and on x86 cpumask_weight() calls hweight64(), which uses either a software
bit count (on older kernels and systems with !X86_FEATURE_POPCNT) or a
popcnt-based alternative.
i.e., we needlessly go through this mask-based calculation every time
num_possible_cpus() is called.
The same problem exists with cpu_online_mask, which is a fixed value at
boot time in the !CONFIG_HOTPLUG_CPU case and should not change that often
even in the HOTPLUG case.
Though most of the callers of these two routines run at init time (with a
few exceptions that call them at runtime), it is cleaner to use variables
and not go through this repeated mask-based calculation.
Signed-off-by: Venkatesh Pallipadi <venki@xxxxxxxxxx>
---
include/linux/cpumask.h | 8 ++++++--
kernel/cpu.c | 9 +++++++++
2 files changed, 15 insertions(+), 2 deletions(-)
diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index 4f7a632..2eb04dd 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -80,9 +80,13 @@ extern const struct cpumask *const cpu_online_mask;
extern const struct cpumask *const cpu_present_mask;
extern const struct cpumask *const cpu_active_mask;
+extern int nr_online_cpus;
+
#if NR_CPUS > 1
-#define num_online_cpus() cpumask_weight(cpu_online_mask)
-#define num_possible_cpus() cpumask_weight(cpu_possible_mask)
+
+#define num_online_cpus() (nr_online_cpus)
+#define num_possible_cpus() (nr_cpu_ids)
nr_cpu_ids means the maximum cpu id of the cpus. If cpu ids are sparse, the
maximum id doesn't match the number of cpus.
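
To make the objection concrete, here is a minimal userspace sketch, assuming a
plain 64-bit word stands in for the cpu mask. The helpers model_weight() and
model_nr_ids() are hypothetical names for illustration only, not kernel
functions: the first models what a weight-based num_possible_cpus() reports,
the second models nr_cpu_ids as "highest possible cpu id plus one".

```c
#include <stdint.h>

/* Userspace model only: a 64-bit word stands in for struct cpumask. */

/* Count the set bits, modelling a weight-based num_possible_cpus(). */
static unsigned int model_weight(uint64_t mask)
{
    unsigned int n = 0;

    while (mask) {
        mask &= mask - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Highest set bit index plus one, modelling nr_cpu_ids (0 if empty). */
static unsigned int model_nr_ids(uint64_t mask)
{
    unsigned int ids = 0;

    while (mask) {
        mask >>= 1;
        ids++;
    }
    return ids;
}
```

With a dense mask such as 0xF (cpu ids 0-3) both helpers agree. With a sparse
mask such as 0x55 (cpu ids 0, 2, 4, 6) the weight is 4 while the id-based
value is 7, which is the mismatch described above.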
Yes. But will it be sparse in any arch? I saw some of the users of
num_possible_cpus() doing things like allocating a buffer for that
size and then indexing it using get_cpu(). So, I thought it would be
better to use the existing nr_cpu_ids instead of inventing another
variable. If indeed any arch is depending on this being sparse, we can
have a new variable similar to num_possible_cpus and also audit all
users of num_possible_cpus to see whether they should be using
nr_cpu_ids instead.
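
The buffer-indexing concern can be sketched in userspace as follows, again
assuming a 64-bit word as a stand-in for the possible mask. ids_fit_buffer()
is a hypothetical helper, not a kernel API; it only checks whether every cpu
id in the mask would be a valid index into a buffer of a given length.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Userspace model only: return 1 if every cpu id set in 'possible' is a
 * valid index into a buffer with 'len' slots, 0 otherwise.
 */
static int ids_fit_buffer(uint64_t possible, size_t len)
{
    unsigned int cpu;

    for (cpu = 0; cpu < 64; cpu++) {
        /* A set bit at or beyond 'len' would index out of bounds. */
        if (((possible >> cpu) & 1) && cpu >= len)
            return 0;
    }
    return 1;
}
```

For the sparse mask 0x55 (cpu ids 0, 2, 4, 6), a buffer sized by the number
of cpus (4 slots) fails the check, while one sized by the highest id plus one
(7 slots) passes. Under this model, callers that index by cpu id want the
id-based size, and callers that only need a count want the weight.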