Re: Threaded irqs + 100% CPU RT task = RCU stall

From: Thomas Gleixner
Date: Wed Mar 06 2013 - 14:11:38 EST


On Wed, 6 Mar 2013, Paul E. McKenney wrote:

> On Wed, Mar 06, 2013 at 04:58:54PM +0100, Thomas Gleixner wrote:
> > On Wed, 6 Mar 2013, Paul Gortmaker wrote:
> > > So, I guess the question is, whether we want to try and make the system
> > > fail in a more meaningful way -- kind of like the rt throttling message
> > > does - as it lets users know they've hit the wall? Something watching
> >
> > That Joe Doe should have noticed the throttler message, which came
> > before the stall, shouldn't he?
> >
> > > for kstat_incr_softirqs traffic perhaps? Or other options?
> >
> > The rcu stall detector could use the softirq counter and if it did not
> > change in the stall period print: "Caused by softirq starvation" or
> > something like that.
>
> The idea is to (at grace-period start) take a snapshot of the CPU's
> value of kstat.softirqs[RCU_SOFTIRQ], then check it at stall time, right?

Yep.

> Or do I have the wrong softirq counter?

kstat_softirqs_cpu(RCU_SOFTIRQ, cpu) is the function you want to use.

Thanks,

tglx
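
[Illustration, not part of the original message] A minimal userspace
sketch of the snapshot-and-compare idea discussed above: record the
per-CPU RCU_SOFTIRQ count at grace-period start, then compare it at
stall time. Every sketch_* name, the array standing in for the kernel
counter, and the message wording are assumptions made for illustration.
In the kernel, the read would go through
kstat_softirqs_cpu(RCU_SOFTIRQ, cpu), and the snapshot would presumably
live in RCU's per-CPU data.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define SKETCH_NR_CPUS 4	/* hypothetical CPU count for the sketch */

/*
 * Stand-in for the per-CPU RCU_SOFTIRQ count. In the kernel this value
 * would be read with kstat_softirqs_cpu(RCU_SOFTIRQ, cpu).
 */
static unsigned int sketch_softirq_count[SKETCH_NR_CPUS];

/* Snapshot of the counter taken at grace-period start, one per CPU. */
static unsigned int sketch_gp_snap[SKETCH_NR_CPUS];

/* Hypothetical accessor that mimics reading the kernel counter. */
static unsigned int sketch_read_softirqs(int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	return sketch_softirq_count[cpu];
}

/* Grace-period start: remember how many RCU softirqs have run so far. */
static void sketch_gp_start(int cpu)
{
	sketch_gp_snap[cpu] = sketch_read_softirqs(cpu);
}

/*
 * Stall time: if the counter has not moved since the snapshot, no RCU
 * softirq ran on this CPU during the grace period.
 */
static bool sketch_softirq_starved(int cpu)
{
	return sketch_read_softirqs(cpu) == sketch_gp_snap[cpu];
}

/* Print a stall report, adding a hint when the counter never changed. */
static void sketch_report_stall(int cpu)
{
	printf("INFO: stall detected on CPU %d%s\n", cpu,
	       sketch_softirq_starved(cpu) ?
	       " (possibly caused by softirq starvation)" : "");
}
```

Calling sketch_gp_start(cpu) and then sketch_report_stall(cpu) with no
change to the counter in between prints the starvation hint. Once the
counter has been incremented after the snapshot, the hint is omitted.
An unsigned equality check also stays correct if the counter wraps,
short of wrapping all the way back to the same value.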