Re: [PATCH 1/4] 2.6.2-rc2-mm2 CPU Hotplug: cpu_active_map

From: Rusty Russell
Date: Mon Feb 02 2004 - 06:14:22 EST


In message <20040202100827.GA28870@xxxxxxx> you write:
>
> * Rusty Russell <rusty@xxxxxxxxxxxxxxx> wrote:
>
> > D: When CPUs are going down, there is a time when cpu_online(cpu) is
> > D: false, but they are still scheduling and responding to interrupts
> > D: (we are migrating things off the CPU, shutting down per-cpu
> > D: threads, etc). It turns out that RCU cares about these CPUs, so
> > D: the decision was made to expose this mask (previously internal to x86,
> > D: and only used for IPIs).
>
> these kinds of problems could be avoided by making the CPU-off as much
> of an atomic operation as possible. The less atomic it is, the more
> kernel code is exposed to the transitional state - and since this is a
> rare situation it will always have quality problems. Is there any killer
> argument that makes it impossible to down a CPU atomically? Kernel
> threads can get their callbacks on other CPUs just fine.

Well, we could grab control of the machine, unconditionally move all
the threads and kill the cpu.

We'd need a lot of locks, but we could just use a bogolock (a-la
module.c's schedule a thread per cpu and tell them all to disable
interrupts) which has the same effect of grabbing every spinlock.

It means that kernel threads must not use smp_processor_id(), because
it might change under them (get_cpu() would protect from this). You'd
still need explicit shutdown code.

> Other tasks should not care. If the migrate-off operation is done
> 100% atomically then zero knowledge is needed by unrelated scheduler
> code about the act of disabling a CPU.

No, migration code still needs to check the cpu is still up after
you've slept.

I'll sleep on it and see if it still sounds like a good idea in the
morning 8)
Rusty.
--
Anyone who quotes me in their sig is an idiot. -- Rusty Russell.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/