toplogy_span_sane() has an O(N^2) algorithm that takes an inordinate
amount of time on systems with a large number of cpus.
The first patch in this series replaces the algorithm used with a O(N)
method that should exactly duplicate the previous code's results.
The second patch simplifies the first, taking a similar amount of time
to run, but potentially has different results than previous code under
situations believed to not truly exist, like a CPU not being included
in its own span.
Version 1:
* Original patch
Version 2:
* Adopted simplifications from K Prateek Nayak,and fixed use of
num_possible_cpus().
Version 3:
* Undid the simplifications from version 2 when noticed that results
could differ from original code; kept num_possible_cpus() fix.
Version 4:
* Turned the patch into a series of 2, the second re-introduces the
simplifications, and includes further simplification suggested by
Valentin Schneider in the discussion for Version 2.
Steve Wahl (2):
sched/topology: improve topology_span_sane speed
sched/topology: Refinement to topology_span_sane speedup
kernel/sched/topology.c | 73 +++++++++++++++++++++++++++--------------
1 file changed, 48 insertions(+), 25 deletions(-)