[PATCH 3/3] sched/topology: Skip updating masks for non-online nodes

From: Srikar Dronamraju
Date: Thu May 20 2021 - 11:45:24 EST


Currently scheduler doesn't check if node is online before adding CPUs
to the node mask. However on some architectures, node distance is only
available for nodes that are online. Its not sure how much to rely on
the node distance, when one of the nodes is offline.

If said node distance is fake (since one of the nodes is offline) and
the actual node distance is different, then the cpumask of such nodes
when the nodes become becomes online will be wrong.

This can cause topology_span_sane to throw up a warning message and the
rest of the topology being not updated properly.

Resolve this by skipping update of cpumask for nodes that are not
online.

Cc: LKML <linux-kernel@xxxxxxxxxxxxxxx>
Cc: linuxppc-dev@xxxxxxxxxxxxxxxx
Cc: Nathan Lynch <nathanl@xxxxxxxxxxxxx>
Cc: Michael Ellerman <mpe@xxxxxxxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Valentin Schneider <valentin.schneider@xxxxxxx>
Cc: Scott Cheloha <cheloha@xxxxxxxxxxxxx>
Cc: Gautham R Shenoy <ego@xxxxxxxxxxxxxxxxxx>
Cc: Dietmar Eggemann <dietmar.eggemann@xxxxxxx>
Cc: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
Cc: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
Cc: Rik van Riel <riel@xxxxxxxxxxx>
Cc: Geetika Moolchandani <Geetika.Moolchandani1@xxxxxxx>
Reported-by: Geetika Moolchandani <Geetika.Moolchandani1@xxxxxxx>
Signed-off-by: Srikar Dronamraju <srikar@xxxxxxxxxxxxxxxxxx>
---
kernel/sched/topology.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index ccb9aff59add..ba0555e83548 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -1731,6 +1731,9 @@ void sched_init_numa(void)
sched_domains_numa_masks[i][j] = mask;

for_each_node(k) {
+ if (!node_online(j))
+ continue;
+
if (sched_debug() && (node_distance(j, k) != node_distance(k, j)))
sched_numa_warn("Node-distance not symmetric");

@@ -1793,6 +1796,9 @@ void sched_domains_numa_masks_set(unsigned int cpu)

for (i = 0; i < sched_domains_numa_levels; i++) {
for (j = 0; j < nr_node_ids; j++) {
+ if (!node_online(j))
+ continue;
+
if (node_distance(j, node) <= sched_domains_numa_distance[i])
cpumask_set_cpu(cpu, sched_domains_numa_masks[i][j]);
}
--
2.27.0