Re: RCU stalls -> lockup.

From: Dave Jones
Date: Mon Oct 06 2014 - 21:28:01 EST


On Sat, Oct 04, 2014 at 10:15:56PM -0400, Tejun Heo wrote:
> On Thu, Oct 02, 2014 at 12:36:55PM -0700, Paul E. McKenney wrote:
> > On Thu, Oct 02, 2014 at 01:55:15PM -0400, Dave Jones wrote:
> > > I just hit this on my box running 3.17rc7
> > > It was followed by a userspace lockup. (Could still ping, and sysrq
> > > from the console, but even getty wasn't responding on the console).
> > >
> > > I was trying to reproduce another bug faster, and had ramped up the
> > > number of processes trinity uses to 512. This didn't take long
> > > to fall out..
> >
> > This might be related to an exchange I had with Tejun (CCed), where
> > the work queues were running all out, preventing any quiescent states
> > from happening. One fix under consideration is to add a quiescent state,
> > similar to the one in softirq handling.
>
> Dave, can you please test whether the following patch makes a
> difference if the problem is reproducible?
>
> http://lkml.kernel.org/r/20141003153701.7c7da030@xxxxxxxxxxxxxxxxxxxxxxxxxxxx

Initial tests look good; I haven't seen any recurrence of the problem so far.

Dave
