On an SMP system, if the boot cpu calls reschedule_idle before the
secondary cpus have all called init_idle, we hit the following code
in reschedule_idle:
...
for (i = 0; i < smp_num_cpus; i++) {
cpu = cpu_logical_map(i);
if (!can_schedule(p, cpu))
continue;
tsk = cpu_curr(cpu);
...
Because init_idle hasn't run for all cpus yet, the expression
"cpu_curr(cpu)"
gives us NULL. When we derefernce an offset of 0x50 off tsk a few
nanoseconds later we panic, complaining vaddr 0x00000050 is invalid.
This is more likely to happen on larger systems, but could happen on any
SMP system (especially if I enable serial console, which really slows down
the secondary procs doing printk ;-) ). If you want to see the panic /
analysis,
I can send it.
Thanks to Alan Cox & Andrew Morton for showing me how to serialise the
cpus to make the panic legible. The following patch holds back the boot
cpu at the end of smp_init until all the secondarys have done init_idle:
--- virgin-2.4.10/kernel/sched.c Mon Sep 17 23:03:09 2001
+++ linux-2.4.10/kernel/sched.c Sat Sep 29 19:57:10 2001
@@ -1309,6 +1309,8 @@
atomic_inc(¤t->files->count);
}
+extern volatile unsigned long wait_init_idle;
+
void __init init_idle(void)
{
struct schedule_data * sched_data;
@@ -1321,6 +1323,7 @@
}
sched_data->curr = current;
sched_data->last_schedule = get_cycles();
+ clear_bit(current->processor, &wait_init_idle);
}
extern void init_timervecs (void);
--- virgin-2.4.10/init/main.c Thu Sep 20 21:02:01 2001
+++ linux-2.4.10/init/main.c Sat Sep 29 21:11:52 2001
@@ -477,6 +477,8 @@
extern void setup_arch(char **);
extern void cpu_idle(void);
+volatile unsigned long wait_init_idle = 0UL;
+
#ifndef CONFIG_SMP
#ifdef CONFIG_X86_IO_APIC
@@ -490,13 +492,25 @@
#else
+
/* Called by boot processor to activate the rest. */
static void __init smp_init(void)
{
/* Get other processors into their bootup holding patterns. */
smp_boot_cpus();
+ wait_init_idle = cpu_online_map;
+ clear_bit(current->processor, &wait_init_idle); /* Don't wait on me! */
+ printk("Waiting on wait_init_idle (map = 0x%lx)\n", wait_init_idle);
smp_threads_ready=1;
smp_commence();
+
+ /* Wait for the other cpus to set up their idle processes */
+ while (1) {
+ if (!wait_init_idle)
+ break;
+ rep_nop();
+ }
+ printk("All processors have done init_idle\n");
}
#endif
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
This archive was generated by hypermail 2b29 : Sun Sep 30 2001 - 21:01:11 EST