Re: 2.6.29-rc1 does not boot

From: Rusty Russell
Date: Wed Jan 14 2009 - 19:54:37 EST


On Monday 12 January 2009 20:30:53 Ingo Molnar wrote:
> work_on_cpu() seems
> completely unsuited for any sort of set_cpus_allowed() replacement ...

That's harsh. set_cpus_allowed is *always* questionable in these cases. Sometimes it's harmless, and sometimes there was a risk that we could run on the wrong cpu.

The mistake was that work_on_cpu() should rely on the caller to ensure the CPU doesn't go away. It's a worse interface, but this reduces the number of *new* bugs, at least.

Subject: cpumask: don't try to get_online_cpus() in work_on_cpu.

This has caused more problems than it solved, with a pile of cpu
hotplug locking issues.

Followup patches will get_online_cpus() in callers that need it, but
if they don't do it they're no worse than before when they were using
set_cpus_allowed without locking.

Signed-off-by: Rusty Russell <rusty@xxxxxxxxxxxxxxx>

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -991,8 +991,7 @@ static void do_work_for_cpu(struct work_
* @fn: the function to run
* @arg: the function arg
*
- * This will return -EINVAL in the cpu is not online, or the return value
- * of @fn otherwise.
+ * It is up to the caller to ensure that the cpu doesn't go offline.
*/
long work_on_cpu(unsigned int cpu, long (*fn)(void *), void *arg)
{
@@ -1001,14 +1000,12 @@ long work_on_cpu(unsigned int cpu, long
INIT_WORK(&wfc.work, do_work_for_cpu);
wfc.fn = fn;
wfc.arg = arg;
- get_online_cpus();
if (unlikely(!cpu_online(cpu)))
wfc.ret = -EINVAL;
else {
schedule_work_on(cpu, &wfc.work);
flush_work(&wfc.work);
}
- put_online_cpus();

return wfc.ret;
}
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/