[PATCH 00/11] x86: cleanup early per cpu variables/accesses v5-folded

From: Mike Travis
Date: Fri Apr 25 2008 - 20:16:33 EST



1/11: Increase the limit of NR_CPUS to 4096 and introduce a boolean
called "MAXSMP" which, when set (e.g. by "allyesconfig"), sets
NR_CPUS = 4096 and NODES_SHIFT = 9 (512 nodes). Change the maximum
setting for NODES_SHIFT from 15 to 9 to accurately reflect the real limit.
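
For reference, the arithmetic behind these limits can be sketched in
standalone userspace C. The macro names mirror the kernel's, but
cpumask_bytes() is a hypothetical helper used only for illustration and
is not part of this series.

```c
#include <assert.h>

/* Values taken from the description of patch 1/11. */
#define NODES_SHIFT   9
#define MAX_NUMNODES  (1 << NODES_SHIFT)   /* number of possible nodes */
#define NR_CPUS       4096

/* Hypothetical helper: bytes needed for a bitmap with one bit per cpu. */
static unsigned long cpumask_bytes(unsigned long nr_cpus)
{
	assert(nr_cpus > 0);
	return (nr_cpus + 7) / 8;   /* round up to whole bytes */
}
```

With these settings a single cpu bitmap occupies 512 bytes, which is why
the later patches in the series try to avoid copying and over-scanning
such masks.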

2/11: Introduce a new PER_CPU macro called "EARLY_PER_CPU". This is
used by some per_cpu variables that are initialized and accessed
before there are per_cpu areas allocated.

Add a flag "arch_provides_topology_pointers" that indicates pointers
to topology cpumask_t maps are available. Otherwise, use the function
returning the cpumask_t value. Using pointers avoids copying data onto
and off of the stack when cpumask_t is very large.
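
The difference between the two access styles can be sketched as
follows. This is a userspace illustration, not the code from the patch:
the cpumask_t stand-in, the core_siblings_map array, and every function
name below are assumptions made for the example.

```c
#include <assert.h>

#define NR_CPUS 4096
#define BITS_PER_LONG (8 * sizeof(unsigned long))

/* Stand-in for the kernel's cpumask_t: one bit per possible cpu. */
typedef struct {
	unsigned long bits[NR_CPUS / BITS_PER_LONG];
} cpumask_t;

/* Hypothetical topology map, one mask per cpu (only 4 cpus here). */
static cpumask_t core_siblings_map[4];

/* By value: the whole mask is copied into the caller's stack frame. */
static cpumask_t siblings_by_value(int cpu)
{
	return core_siblings_map[cpu];
}

/* By pointer: only an address is returned, nothing is copied. */
static const cpumask_t *siblings_by_pointer(int cpu)
{
	return &core_siblings_map[cpu];
}

/* Hypothetical helpers to set and test one bit in a mask. */
static void mask_set_cpu(int cpu, cpumask_t *mask)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	mask->bits[cpu / BITS_PER_LONG] |= 1UL << (cpu % BITS_PER_LONG);
}

static int mask_test_cpu(int cpu, const cpumask_t *mask)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return (mask->bits[cpu / BITS_PER_LONG] >> (cpu % BITS_PER_LONG)) & 1UL;
}
```

With NR_CPUS = 4096 each by-value call moves 512 bytes, while the
pointer form moves a single machine word.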

3/11: Restore the nodenumber field in the x86_64 pda. This field differs
slightly from the x86_cpu_to_node_map, mainly because it is a static
indication of which node the cpu is on, while the cpu to node map is a
dynamic mapping that may get reset if the cpu goes offline. This also
simplifies the numa_node_id() macro.

4/11: Consolidate node_to_cpumask operations and remove the 256k
byte node_to_cpumask_map. This is done by allocating the
node_to_cpumask_map array after the number of possible
nodes (nr_node_ids) is known.
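
A rough userspace sketch of the idea, assuming a 512-byte cpumask_t
(NR_CPUS = 4096) and 512 possible nodes, which is consistent with the
256k figure. calloc() stands in for whatever boot-time allocator the
patch actually uses, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

#define NR_CPUS      4096
#define MAX_NUMNODES 512   /* 1 << NODES_SHIFT, with NODES_SHIFT = 9 */

/* Stand-in for the kernel's cpumask_t: one bit per possible cpu. */
typedef struct {
	unsigned char bits[NR_CPUS / 8];
} cpumask_t;

/* Size of a statically sized map: one mask per configurable node. */
static size_t static_map_bytes(void)
{
	return (size_t)MAX_NUMNODES * sizeof(cpumask_t);
}

/*
 * Allocate one zeroed mask per node that can exist on this system.
 * Returns NULL on allocation failure; the caller frees the result.
 */
static cpumask_t *alloc_node_to_cpumask_map(int nr_node_ids)
{
	assert(nr_node_ids > 0 && nr_node_ids <= MAX_NUMNODES);
	return calloc((size_t)nr_node_ids, sizeof(cpumask_t));
}
```

On a system with, say, 4 possible nodes the dynamically sized map needs
2k bytes instead of 256k.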

5/11: Replace usages of MAX_NUMNODES with nr_node_ids in kernel/sched.c,
where appropriate. This saves some allocated space as well as many
wasted cycles going through node entries that are non-existent.

6/11: Change some global definitions in drivers/base/cpu.c to static.

7/11: Remove 544k bytes from the kernel by removing the boot_cpu_pda
array from the data section and allocating it during startup.

8/11: Increase performance for systems with a large NR_CPUS by
limiting the range of the cpumask operators that loop over
the bits in a cpumask_t variable. This removes a large amount
of wasted cpu cycles.
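
The effect of limiting the range can be illustrated with a naive
bit-by-bit scan in userspace C. This assumes the limit is the number of
cpu ids that can actually exist on the running system; count_cpus() and
its probe counter are hypothetical and only serve to make the saved
work visible.

```c
#include <assert.h>

#define NR_CPUS 4096
#define BITS_PER_LONG (8 * sizeof(unsigned long))

/* Stand-in for the kernel's cpumask_t: one bit per possible cpu. */
typedef struct {
	unsigned long bits[NR_CPUS / BITS_PER_LONG];
} cpumask_t;

/*
 * Count the cpus set in 'mask', examining only bits [0, limit).
 * '*bits_examined' reports how many bit positions were visited.
 */
static int count_cpus(const cpumask_t *mask, int limit,
		      unsigned long *bits_examined)
{
	int cpu, count = 0;

	assert(limit >= 0 && limit <= NR_CPUS);
	*bits_examined = 0;
	for (cpu = 0; cpu < limit; cpu++) {
		(*bits_examined)++;
		if (mask->bits[cpu / BITS_PER_LONG] &
		    (1UL << (cpu % BITS_PER_LONG)))
			count++;
	}
	return count;
}
```

On an 8-cpu system, passing 8 as the limit gives the same answer as
passing NR_CPUS while visiting 8 bit positions instead of 4096.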

9/11: Change references from for_each_cpu_mask to for_each_cpu_mask_ptr
in all cases for x86_64 and generic code.

10/11: Change references from next_cpu to next_cpu_nr (or for_each_cpu_mask_ptr
if applicable), in all cases for x86_64 and generic code.

11/11: Pass reference to cpumask variable in net/sunrpc/svc.c


For inclusion into sched-devel/latest tree.

Based on:
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
+ sched-devel/latest .../mingo/linux-2.6-sched-devel.git


Signed-off-by: Mike Travis <travis@xxxxxxx>
---
v5-folded:
Folded in patches 7 - 11 above.
Fixed some warnings in NONUMA config build.

v4-folded:
Folded in Kconfig changes to increase NR_CPU limit to 4096 and add new
config option MAXSMP.

v3-folded:
Folded in follow on "fix" patches to consolidate changes into one place.
Includes change to drivers/base/topology.c to fix s390 build error.
Includes change to fix preemption warning when numa_node_id is used.
checkpatch.pl errors/warnings checked and fixed where possible.

v2: remerged PATCH 2/2 with latest x86.git/latest and sched-devel/latest,
rebuilt and retested 4k-akpm2, 4k-defconfig, nonuma, & nosmp configs.
