[PATCH v2 1/2] rcu: don't bind offline cpu

From: KOSAKI Motohiro
Date: Thu May 19 2011 - 02:06:31 EST


Hi Paul,

I've made a new patch. Is this acceptable to you?


==============================================================
While discussing the cpuset_cpus_allowed_fallback() fix, we found that
the rcu subsystem doesn't use kthread_bind() correctly.

In detail: a typical subsystem wakes up its kthread in the CPU_ONLINE
notifier (i.e. _after_ the cpu is onlined), but the rcu subsystem wakes
up its kthread in the CPU_UP_PREPARE notifier (i.e. _before_ the cpu is
onlined), because otherwise RCU grace periods in CPU_ONLINE notifiers
would never complete.

This makes a big difference. If we fix cpuset_cpus_allowed_fallback(),
the sched load balancer runs before scheduler smp initialization and
crashes the kernel (see below).

kernel_init();
  smp_init();
    _cpu_up();
      __cpu_notify(CPU_UP_PREPARE | mod, hcpu, -1, &nr_calls);
        rcu_cpu_notify();
          rcu_online_kthreads();
            rcu_spawn_one_node_kthread();
              wake_up_process();
                try_to_wake_up();
                  select_task_rq();
                    select_fallback_rq();
                      cpuset_cpus_allowed_fallback();
                      /* here the rcu_thread's cpus_allowed will
                       * be set to cpu_possible_mask, but now
                       * we only have the boot cpu online, so it
                       * will run on the boot cpu;
                       * p->rt.nr_cpus_allowed will be set to
                       * cpumask_weight(cpu_possible_mask);
                       */
              sched_setscheduler_nocheck(t, SCHED_FIFO, &sp);
                __sched_setscheduler();
                  check_class_changed();
                    p->sched_class->switched_to(rq, p); /* rt_class */
                      push_rt_task();
                        find_lock_lowest_rq();
                          find_lowest_rq();
                          /* crash here because local_cpu_mask is uninitialized */

The right way is an explicit two-phase cpu binding: (1) bind to the
boot (or any other online) cpu at CPU_UP_PREPARE, and (2) bind to the
correct target cpu at CPU_ONLINE. This patch does that.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@xxxxxxxxxxxxxx>
Cc: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
Cc: Yong Zhang <yong.zhang0@xxxxxxxxx>
Cc: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
---
kernel/rcutree.c | 11 +++++++++--
1 files changed, 9 insertions(+), 2 deletions(-)
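
For reviewers, the two-phase binding described in the changelog can be
modeled in user space roughly as follows. This is only a sketch under
stated assumptions: every name in it (model_kthread, model_bind,
model_any_online_cpu, the PHASE_* constants) is hypothetical and not a
kernel API, the online mask is a plain 32-bit word with one bit per cpu,
and "any online cpu" is taken to mean the lowest-numbered one.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
enum hotplug_phase { PHASE_UP_PREPARE, PHASE_ONLINE };

struct model_kthread {
	int target_cpu;	/* cpu the thread finally belongs to (0..31) */
	int bound_cpu;	/* cpu it is bound to right now, -1 if none */
};

/* Return the lowest-numbered cpu set in online_mask (one bit per cpu). */
static int model_any_online_cpu(uint32_t online_mask)
{
	int cpu;

	/* Assumption: at least the boot cpu is always online. */
	assert(online_mask != 0);
	for (cpu = 0; cpu < 32; cpu++)
		if (online_mask & (UINT32_C(1) << cpu))
			return cpu;
	return -1;	/* not reached when the mask is non-zero */
}

/*
 * Phase 1 (UP_PREPARE): the target cpu is not online yet, so bind to
 * some cpu that is online.
 * Phase 2 (ONLINE): rebind to the target cpu, but only if it really is
 * set in online_mask; otherwise keep the temporary binding.
 */
static void model_bind(struct model_kthread *t, enum hotplug_phase phase,
		       uint32_t online_mask)
{
	if (phase == PHASE_UP_PREPARE)
		t->bound_cpu = model_any_online_cpu(online_mask);
	else if (online_mask & (UINT32_C(1) << t->target_cpu))
		t->bound_cpu = t->target_cpu;
}
```

In this model a thread targeting cpu 2 is first bound to cpu 0 while
only cpu 0 is online, and moves to cpu 2 once cpu 2 appears in the
online mask. The point of the split is that the thread is never bound
to a cpu that is absent from the online mask.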

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 5616b17..924b0cd 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -1645,6 +1645,7 @@ static int __cpuinit rcu_spawn_one_cpu_kthread(int cpu)
{
struct sched_param sp;
struct task_struct *t;
+ int bound_cpu;

if (!rcu_kthreads_spawnable ||
per_cpu(rcu_cpu_kthread_task, cpu) != NULL)
@@ -1652,8 +1653,14 @@ static int __cpuinit rcu_spawn_one_cpu_kthread(int cpu)
t = kthread_create(rcu_cpu_kthread, (void *)(long)cpu, "rcuc%d", cpu);
if (IS_ERR(t))
return PTR_ERR(t);
- kthread_bind(t, cpu);
- per_cpu(rcu_cpu_kthread_cpu, cpu) = cpu;
+ /*
+ * The target cpu isn't online yet, so the rcuc kthread can't be bound to
+ * it. Thus we temporarily bind it to another online cpu.
+ * rcu_cpu_kthread_should_stop() rebinds it to the target cpu later.
+ */
+ bound_cpu = cpumask_any(cpu_online_mask);
+ kthread_bind(t, bound_cpu);
+ per_cpu(rcu_cpu_kthread_cpu, cpu) = bound_cpu;
WARN_ON_ONCE(per_cpu(rcu_cpu_kthread_task, cpu) != NULL);
per_cpu(rcu_cpu_kthread_task, cpu) = t;
wake_up_process(t);
--
1.7.3.1


