Re: race condition in schedule_on_each_cpu()

From: weiqi@xxxxxxxxxxxxxx
Date: Sat Jun 08 2013 - 07:30:46 EST


Hello Tejun Heo,

I've backported the schedule_on_each_cpu() "direct execution" patch to 3.0.30-rt50,
and it fixed my problem.

The attached patch is the one that worked for me.

However, I do not understand why machine1 exposes the problem but machine2 does not.

My guess is that it is related to the RT kernel's preemption level. If so, could the difference between the two machines be due to CPU performance?

What do you think about this?

Thank you~

diff -up linux-3.0.30-rt50/kernel/workqueue.c.bak linux-3.0.30-rt50/kernel/workqueue.c
--- linux-3.0.30-rt50/kernel/workqueue.c.bak 2013-06-08 19:09:06.801059232 +0800
+++ linux-3.0.30-rt50/kernel/workqueue.c 2013-06-08 19:09:15.680069626 +0800
@@ -1922,6 +1922,7 @@ static int worker_thread(void *__worker)

 	/* tell the scheduler that this is a workqueue worker */
 	worker->task->flags |= PF_WQ_WORKER;
+	smp_mb();
 woke_up:
 	spin_lock_irq(&gcwq->lock);

@@ -2736,6 +2737,7 @@ EXPORT_SYMBOL(schedule_delayed_work_on);
 int schedule_on_each_cpu(work_func_t func)
 {
 	int cpu;
+	int orig = -1;
 	struct work_struct __percpu *works;

 	works = alloc_percpu(struct work_struct);
@@ -2744,13 +2746,20 @@ int schedule_on_each_cpu(work_func_t fun

 	get_online_cpus();

+	if(current->flags & PF_WQ_WORKER)
+		orig = raw_smp_processor_id();
+
 	for_each_online_cpu(cpu) {
 		struct work_struct *work = per_cpu_ptr(works, cpu);

 		INIT_WORK(work, func);
-		schedule_work_on(cpu, work);
+		if(cpu != orig)
+			schedule_work_on(cpu, work);
 	}

+	if (orig >= 0)
+		func(per_cpu_ptr(works,orig));
+
 	for_each_online_cpu(cpu)
 		flush_work(per_cpu_ptr(works, cpu));