Re: sched domains bringup race?

From: Nick Piggin
Date: Mon Jul 19 2004 - 02:32:49 EST


Keshavamurthy Anil S wrote:
On Thu, Jul 15, 2004 at 07:13:46PM -0700, Dave Hansen wrote:

I keep getting oopses for the non-boot CPU in find_busiest_group(). This occurs the first time that the CPU goes idle. Those groups are set
up in sched_init_smp(), which is called after smp_init():

static int init(void * unused)
{
...
fixup_cpu_present_map();
smp_init();
sched_init_smp();

But, the idle threads for the secondary CPUs are initialized in
smp_init(). So, what happens when a CPU tries to schedule (using sched
domains) before sched_init_smp() completes? I think it goes boom! :)

Anyway, I was thinking that we should just hold the runqueue lock on the
non-boot CPUs until the sched domain init code is done. Does that sound
feasible?


Even on my system which is Intel 865 chipset (P4 with HT enabled system) I see a bug check somewhere in the schedular_tick during boot.
However if I move the sched_init_smp() after do_basic_setup() the
kernel boots without any problem. Any clue here?

static int init(void * unused)
{
...
fixup_cpu_present_map();
smp_init();
populate_rootfs();
do_basic_setup();
sched_init_smp();

There shouldn't be any problem doing that if we have to, obviously we
need to know why. Is it possible that cpu_sibling_map, or one of the
CPU masks isn't set up correctly at the time of the call?
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/