Re: [2.6.30-rc1] RCU detected CPU 1 stall

From: Al Viro
Date: Fri Apr 10 2009 - 11:04:18 EST


On Fri, Apr 10, 2009 at 07:22:03AM -0700, Paul E. McKenney wrote:

> Hmmmm... This indicates that CPU 1 was spinning in the kernel for
> a long time. At 250 HZ, 32,565 jiffies is 130 seconds, or just over
> two -minutes-. Ouch!!!
>
> The interrupt happened on the stalled CPU, so we know that interrupts
> were enabled. Because we have CONFIG_PREEMPT_NONE=y, there is no
> preemption, so preemption need not be disabled. This could be due
> to lock contention, or even a simple infinite loop.
>
> The timer interrupt (apic_timer_interrupt) occurred in either
> __bprm_mm_init(), __get_user_4(), count(), or do_execve(). There
> have been some recent changes around check_unsafe_exec() -- any
> possibility that these introduced excessive lock contention or
> an infinite loop? Ditto for the recent security fixes?
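
For reference, the stall duration quoted above can be checked with a short C sketch. The helper name jiffies_to_secs is hypothetical and is not a kernel function; it only assumes that HZ is the number of timer ticks per second, as in the quoted text.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not a kernel API): convert a jiffies count to
 * whole seconds, given the tick rate hz in ticks per second.
 */
static unsigned long jiffies_to_secs(unsigned long jiffies, unsigned int hz)
{
	return jiffies / hz;
}

/* Print the conversion in the same terms the quoted message uses. */
static void report_stall(unsigned long jiffies, unsigned int hz)
{
	printf("%lu jiffies at HZ=%u is about %lu seconds\n",
	       jiffies, hz, jiffies_to_secs(jiffies, hz));
}
```

Calling report_stall(32565, 250) reports about 130 seconds, which matches the "just over two minutes" figure in the quoted message.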

Oh, joy... the loop in there is this:
	for (t = next_thread(p); t != p; t = next_thread(t)) {
		if (t->fs == p->fs)
			n_fs++;
	}
I find it hard to believe that it can take two minutes, though.
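
To make the shape of that loop concrete, here is a minimal user-space sketch of the same traversal. The types task_model and fs_model and the function next_thread_model are hypothetical stand-ins for the kernel's task_struct, fs_struct and next_thread(); the sketch assumes only that the threads of a group form a circular list. It counts the other threads that share p's fs, starting from zero, which may differ from the initial value used in the kernel.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's fs_struct. */
struct fs_model {
	int unused;
};

/* Hypothetical stand-in for task_struct: one node in a circular thread list. */
struct task_model {
	struct fs_model *fs;
	struct task_model *next;	/* next thread in the group, wraps around */
};

/* Model of next_thread(): step to the following thread in the ring. */
static struct task_model *next_thread_model(struct task_model *t)
{
	return t->next;
}

/* Count the threads other than p in p's group that share p->fs. */
static unsigned int count_shared_fs(struct task_model *p)
{
	struct task_model *t;
	unsigned int n_fs = 0;

	/* The walk ends only when it arrives back at p. */
	for (t = next_thread_model(p); t != p; t = next_thread_model(t)) {
		if (t->fs == p->fs)
			n_fs++;
	}
	return n_fs;
}
```

In this model the loop does one pointer comparison per thread, so a single pass costs time linear in the size of the thread group. It terminates only if following next pointers eventually leads back to p.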