Re: frequent lockups in 3.18rc4

From: Dave Jones
Date: Tue Nov 18 2014 - 23:59:36 EST


On Tue, Nov 18, 2014 at 08:40:55PM -0800, Linus Torvalds wrote:
> On Tue, Nov 18, 2014 at 6:19 PM, Dave Jones <davej@xxxxxxxxxx> wrote:
> >
> > NMI watchdog: BUG: soft lockup - CPU#2 stuck for 21s! [trinity-c42:31480]
> > CPU: 2 PID: 31480 Comm: trinity-c42 Not tainted 3.18.0-rc5+ #91 [loadavg: 174.61 150.35 148.64 9/411 32140]
> > RIP: 0010:[<ffffffff8a1798b4>] [<ffffffff8a1798b4>] context_tracking_user_enter+0xa4/0x190
> > Call Trace:
> > [<ffffffff8a012fc5>] syscall_trace_leave+0xa5/0x160
> > [<ffffffff8a7d8624>] int_check_syscall_exit_work+0x34/0x3d
>
> Hmm, if we are getting soft-lockups here, maybe it suggest too much exit-work.
>
> Some TIF_NOHZ loop, perhaps? You have nohz on, don't you?

I do.

> That makes me wonder: does the problem go away if you disable NOHZ?

I'll give it a try, and see what falls out overnight.

Dave
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/