On Sun, 2007-03-11 at 22:09 +0100, Rafael J. Wysocki wrote:
Definitely something strange is going on here.
Hmmmm, both variants (nohz=off or recompiled kernel without NO_HZ) work for me.
I tried to boot with nohz=off, but the problem did persist.
Does the problem go away if NO_HZ is unset?

Well, I think the call to wait_for_completion() does not return, probably
because the task supposed to complete the completion is frozen at this
point. Can you please try to confirm that it gets stuck on
wait_for_completion() in synchronize_rcu()?

Yes, it's in wait_for_completion() in synchronize_rcu().
update_sched_domains
  detach_destroy_domains
    [waits here] --> synchronize_sched (==synchronize_rcu)
As noted in a previous mail, it wakes up after an
event, such as a key press.
The patch in http://lkml.org/lkml/2007/3/7/255 solves a different problem.
I added it to my quilt series and applied it anyway; no change.
I think we need advice from someone who knows the RCU internals.
RCU synchronization depends on the timer interrupt. Which kernel version
are you guys talking about?
tglx