Re: -next: Nov 12 - kernel BUG at kernel/sched.c:7359!

From: Sachin Sant
Date: Fri Nov 13 2009 - 06:44:32 EST


Peter Zijlstra wrote:
> Well, it boots for me, but then, I've not been able to reproduce any
> issues anyway :/
>
> /me goes try a PREEMPT=n kernel, since that is what Mike reports boot
> funnies with..

With the suggested changes against -next the machine boots fine.
After multiple runs of hackbench, kernbench, and cpu_hotplug tests the
machine is still up and running. So at this point all is well.
I will continue to monitor the box for a while..

I just picked up the changes made to kernel/sched.c. Have attached
the changes here.

Thanks for all your help.

Thanks
-Sachin

--

---------------------------------
Sachin Sant
IBM Linux Technology Center
India Systems and Technology Labs
Bangalore, India
---------------------------------

diff -Naurp a/kernel/sched.c b/kernel/sched.c
--- a/kernel/sched.c 2009-11-13 16:53:19.000000000 +0530
+++ b/kernel/sched.c 2009-11-13 16:50:47.000000000 +0530
@@ -2372,13 +2372,22 @@ static int try_to_wake_up(struct task_st
 	if (task_contributes_to_load(p))
 		rq->nr_uninterruptible--;
 	p->state = TASK_WAKING;
-	task_rq_unlock(rq, &flags);
+	__task_rq_unlock(rq);
 
+again:
 	cpu = p->sched_class->select_task_rq(p, SD_BALANCE_WAKE, wake_flags);
+	if (!cpu_online(cpu))
+		cpu = cpumask_any_and(&p->cpus_allowed, cpu_active_mask);
+	if (cpu >= nr_cpu_ids) {
+		printk(KERN_ERR "Breaking affinity on %d/%s\n", p->pid, p->comm);
+		cpuset_cpus_allowed_locked(p, &p->cpus_allowed);
+		goto again;
+	}
+
 	if (cpu != orig_cpu)
 		set_task_cpu(p, cpu);
 
-	rq = task_rq_lock(p, &flags);
+	rq = __task_rq_lock(p);
 
 	if (rq != orig_rq)
 		update_rq_clock(rq);
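
For anyone following the thread without a tree handy, the CPU fallback added in the hunk above can be modelled in userspace roughly as below. This is only a sketch and not kernel code: pick_cpu(), any_and(), NR_CPU_IDS and the plain 32-bit masks are hypothetical stand-ins for select_task_rq(), cpumask_any_and(), nr_cpu_ids and the real cpumasks. The model also assumes the preferred CPU stays the same on retry, and it returns -1 after one failed widening instead of looping again, which the patch itself does not do.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define NR_CPU_IDS 8	/* hypothetical CPU count for this model */

/*
 * Return the lowest CPU set in both masks, or NR_CPU_IDS if the
 * intersection is empty (same "cpu >= nr_cpu_ids" convention as the patch).
 */
static int any_and(uint32_t a, uint32_t b)
{
	uint32_t both = a & b;
	int cpu;

	for (cpu = 0; cpu < NR_CPU_IDS; cpu++)
		if (both & (1u << cpu))
			return cpu;
	return NR_CPU_IDS;
}

/*
 * Hypothetical model of the fallback: 'preferred' is the CPU the
 * scheduling class picked, '*allowed' the task affinity, and
 * 'cpuset_mask' the wider mask used when affinity has to be broken.
 */
int pick_cpu(int preferred, uint32_t *allowed, uint32_t online,
	     uint32_t active, uint32_t cpuset_mask)
{
	int widened = 0;

	assert(preferred >= 0 && preferred < NR_CPU_IDS);

	for (;;) {
		int cpu = preferred;

		/* Preferred CPU went offline: try any allowed, active CPU. */
		if (!(online & (1u << cpu)))
			cpu = any_and(*allowed, active);
		if (cpu < NR_CPU_IDS)
			return cpu;

		/* Model-only bail-out so the sketch always terminates. */
		if (widened)
			return -1;

		/* No allowed CPU is active: break affinity and retry. */
		fprintf(stderr, "Breaking affinity\n");
		*allowed = cpuset_mask;
		widened = 1;
	}
}
```

With CPU 2 offline and the task allowed only on CPU 2, the first pass finds nothing, the affinity is widened to cpuset_mask, and the retry picks the lowest active CPU in the widened mask.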