Re: [PATCH] sched: fix constructing the span cpu mask of scheddomain

From: Peter Zijlstra
Date: Tue May 10 2011 - 04:29:15 EST


On Thu, 2011-05-05 at 20:53 +0800, Hillf Danton wrote:
> For a given node, when constructing the cpumask for its sched_domain
> to span, if there is no best node available after searching, further
> efforts could be saved, based on small change in the return value of
> find_next_best_node().
>
> Signed-off-by: Hillf Danton <dhillf@xxxxxxxxx>
> ---
>
> --- a/kernel/sched.c 2011-04-27 11:48:50.000000000 +0800
> +++ b/kernel/sched.c 2011-05-05 20:44:52.000000000 +0800
> @@ -6787,7 +6787,7 @@ init_sched_build_groups(const struct cpu
> */
> static int find_next_best_node(int node, nodemask_t *used_nodes)
> {
> - int i, n, val, min_val, best_node = 0;
> + int i, n, val, min_val, best_node = -1;
>
> min_val = INT_MAX;
>
> @@ -6811,7 +6811,8 @@ static int find_next_best_node(int node,
> }
> }
>
> - node_set(best_node, *used_nodes);
> + if (best_node != -1)
> + node_set(best_node, *used_nodes);
> return best_node;
> }
>
> @@ -6837,7 +6838,8 @@ static void sched_domain_node_span(int n
>
> for (i = 1; i < SD_NODES_PER_DOMAIN; i++) {
> int next_node = find_next_best_node(node, &used_nodes);
> -
> + if (next_node < 0)
> + break;
> cpumask_or(span, span, cpumask_of_node(next_node));
> }
> }
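
For readers following along without a kernel tree, the behaviour the patch
aims for can be sketched in user space. This is only an illustration under
stated assumptions: the distance table, the cpus_on_node[] array, the plain
unsigned bitmask standing in for nodemask_t, and the node_span() helper are
all hypothetical stand-ins, not the kernel's data structures.

```c
#include <assert.h>
#include <limits.h>

#define NR_NODES 4	/* hypothetical node count */

/* Hypothetical distance table (10 = local), standing in for node_distance(). */
static const int distance[NR_NODES][NR_NODES] = {
	{10, 20, 30, 40},
	{20, 10, 20, 30},
	{30, 20, 10, 20},
	{40, 30, 20, 10},
};

/* Hypothetical CPU counts; node 3 has no CPUs, so it is never picked. */
static const int cpus_on_node[NR_NODES] = {2, 2, 2, 0};

/* Return the closest unused node that has CPUs, or -1 if none is left. */
static int find_next_best_node(int node, unsigned int *used_nodes)
{
	int best_node = -1;
	int min_val = INT_MAX;

	assert(node >= 0 && node < NR_NODES);

	for (int i = 0; i < NR_NODES; i++) {
		int n = (node + i) % NR_NODES;	/* start the scan at @node */

		if (!cpus_on_node[n])
			continue;
		if (*used_nodes & (1u << n))	/* skip already used nodes */
			continue;

		int val = distance[node][n];
		if (val < min_val) {
			min_val = val;
			best_node = n;
		}
	}

	/* Only mark a node as used when one was actually found. */
	if (best_node != -1)
		*used_nodes |= 1u << best_node;
	return best_node;
}

/* Build a node bitmask for @node's span, stopping early when nodes run out. */
static unsigned int node_span(int node, int max_nodes)
{
	unsigned int used = 1u << node;
	unsigned int span = 1u << node;

	for (int i = 1; i < max_nodes; i++) {
		int next = find_next_best_node(node, &used);

		if (next < 0)
			break;
		span |= 1u << next;
	}
	return span;
}
```

With this table, node_span(0, 16) stops after two successful lookups instead
of running the search for all 15 remaining iterations.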


If you're interested in this area of the scheduler, you might want to
have a poke at:

http://marc.info/?l=linux-kernel&m=130218515520540

That tries to rewrite the CONFIG_NUMA support for the sched_domain stuff
to create domains based on the node_distance() to better reflect the
actual machine topology.
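
A rough user-space sketch of the general idea (not the code from the linked
patch): take the distinct values in the distance table as candidate domain
levels, and let a node's span at each level be every node within that
distance. The 4-node ring table and the helper names distance_levels() and
span_within() are assumptions made up for this illustration.

```c
#include <assert.h>

#define NR_NODES 4	/* hypothetical node count */

/*
 * Hypothetical 4-node ring: each node is one hop (20) from its two
 * neighbours and two hops (30) from the opposite node.
 */
static const int distance[NR_NODES][NR_NODES] = {
	{10, 20, 30, 20},
	{20, 10, 20, 30},
	{30, 20, 10, 20},
	{20, 30, 20, 10},
};

/* Collect the distinct distance values in ascending order; return the count. */
static int distance_levels(int levels[], int max_levels)
{
	int count = 0;

	for (int i = 0; i < NR_NODES; i++) {
		for (int j = 0; j < NR_NODES; j++) {
			int d = distance[i][j];
			int pos = 0;

			while (pos < count && levels[pos] < d)
				pos++;
			if (pos < count && levels[pos] == d)
				continue;	/* already recorded */

			assert(count < max_levels);
			for (int k = count; k > pos; k--)
				levels[k] = levels[k - 1];
			levels[pos] = d;
			count++;
		}
	}
	return count;
}

/* Bitmask of all nodes no further than @max_distance from @node. */
static unsigned int span_within(int node, int max_distance)
{
	unsigned int span = 0;

	for (int n = 0; n < NR_NODES; n++)
		if (distance[node][n] <= max_distance)
			span |= 1u << n;
	return span;
}
```

For this table the levels come out as 10, 20 and 30, and node 0's span grows
from itself, to itself plus its two neighbours, to all four nodes.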

As stated, that patch is currently very broken, mostly because the
topologies encountered don't map to non-overlapping trees. I've not yet
worked out how to deal with that, but we sure need to do something
like that: the current scheme of groups of 16 nodes plus one group of
all nodes simply doesn't work well for today's machines, now that NUMA
is common and inter-node latencies are more relevant.
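
To make the non-overlapping-tree problem concrete, here is a small user-space
check. It assumes a made-up 4-node ring, not any particular machine from the
thread, and the names span_within() and spans_form_partition() are
hypothetical. A level can only be one layer of a tree if any two nodes' spans
at that level are either identical or disjoint.

```c
#include <assert.h>

#define NR_NODES 4	/* hypothetical node count */

/* Hypothetical 4-node ring: neighbours at 20, the opposite node at 30. */
static const int distance[NR_NODES][NR_NODES] = {
	{10, 20, 30, 20},
	{20, 10, 20, 30},
	{30, 20, 10, 20},
	{20, 30, 20, 10},
};

/* Bitmask of all nodes no further than @max_distance from @node. */
static unsigned int span_within(int node, int max_distance)
{
	unsigned int span = 0;

	assert(node >= 0 && node < NR_NODES);

	for (int n = 0; n < NR_NODES; n++)
		if (distance[node][n] <= max_distance)
			span |= 1u << n;
	return span;
}

/*
 * Return 1 if the spans at this distance level are pairwise identical or
 * disjoint (so they could form one level of a tree), 0 if any two spans
 * partially overlap.
 */
static int spans_form_partition(int max_distance)
{
	for (int a = 0; a < NR_NODES; a++) {
		unsigned int sa = span_within(a, max_distance);

		for (int b = a + 1; b < NR_NODES; b++) {
			unsigned int sb = span_within(b, max_distance);

			if ((sa & sb) && sa != sb)
				return 0;
		}
	}
	return 1;
}
```

In this ring, the distance-20 level fails the check: node 0 spans nodes
{0, 1, 3} while node 1 spans {0, 1, 2}, so the two spans share nodes without
being equal. The distance-10 and distance-30 levels pass.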
