SMP Panic caused by [PATCH] sched: consolidate sched domains

From: James Bottomley
Date: Sun Aug 29 2004 - 08:42:21 EST


This patch causes an immediate panic when the secondary processors come
on-line because sd->next is NULL.

The fix is to use cpu_possible_map instead of nodemask (which expands,
probably erroneously, to cpu_online_map in the non-NUMA case).

Any use of cpu_online_map in initialisation code is almost invariably
wrong, so please don't do it in future.
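
As a rough illustration of why the choice of mask matters at init time,
here is a minimal userspace sketch. It is a toy model, not the kernel
code: struct group, mask_t, build_ring() and cpu_has_ring() are
hypothetical names invented for this example, and the ring building only
loosely mirrors what init_sched_build_groups() does. It assumes that
only the boot CPU is in the online mask when the domains are set up.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4

/* Hypothetical stand-in for a per-CPU scheduler group. */
struct group {
	struct group *next;	/* circular list; NULL means "never linked" */
};

/* Toy cpumask: bit n set means CPU n is in the mask. */
typedef unsigned int mask_t;

static struct group groups[NR_CPUS];

/*
 * Link the group of every CPU in 'span' into a ring.
 * CPUs outside 'span' are left with next == NULL.
 */
static void build_ring(mask_t span)
{
	struct group *first = NULL, *last = NULL;
	int cpu;

	/* Start from a clean state so repeated calls are independent. */
	for (cpu = 0; cpu < NR_CPUS; cpu++)
		groups[cpu].next = NULL;

	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!(span & (1u << cpu)))
			continue;
		if (!first)
			first = &groups[cpu];
		if (last)
			last->next = &groups[cpu];
		last = &groups[cpu];
	}
	if (last)
		last->next = first;	/* close the ring */
}

/* Returns 1 if 'cpu' can walk its ring, 0 if it would follow NULL. */
static int cpu_has_ring(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return groups[cpu].next != NULL;
}
```

In this model, calling build_ring(0x1) (only CPU 0 "online") leaves CPUs
1 to 3 with a NULL next pointer, which is the kind of state a secondary
processor would trip over when it comes up later. Calling build_ring(0xf)
(all "possible" CPUs) links every group up front.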

I know I'm sounding like a broken record, but it would be a lot easier
to spot mistakes like this immediately if every arch used the hotplug
paths to bring SMP up.

Anyway, the attached fixes our panic.

James

===== kernel/sched.c 1.329 vs edited =====
--- 1.329/kernel/sched.c 2004-08-24 02:08:09 -07:00
+++ edited/kernel/sched.c 2004-08-29 06:17:26 -07:00
@@ -3756,7 +3756,7 @@
 		sd = &per_cpu(phys_domains, i);
 		group = cpu_to_phys_group(i);
 		*sd = SD_CPU_INIT;
-		sd->span = nodemask;
+		sd->span = cpu_possible_map;
 		sd->parent = p;
 		sd->groups = &sched_group_phys[group];
 
@@ -3790,7 +3790,7 @@
 		if (cpus_empty(nodemask))
 			continue;
 
-		init_sched_build_groups(sched_group_phys, nodemask,
+		init_sched_build_groups(sched_group_phys, cpu_possible_map,
 						&cpu_to_phys_group);
 	}

