Re: Threaded irqs + 100% CPU RT task = RCU stall

From: Paul E. McKenney
Date: Wed Mar 06 2013 - 12:17:46 EST


On Wed, Mar 06, 2013 at 04:58:54PM +0100, Thomas Gleixner wrote:
> On Wed, 6 Mar 2013, Paul Gortmaker wrote:
> > So, I guess the question is, whether we want to try and make the system
> > fail in a more meaningful way -- kind of like the rt throttling message
> > does - as it lets users know they've hit the wall? Something watching
>
> That Joe Doe should have noticed the throttler message, which came
> before the stall, shouldn't he?
>
> > for kstat_incr_softirqs traffic perhaps? Or other options?
>
> The rcu stall detector could use the softirq counter and if it did not
> change in the stall period print: "Caused by softirq starvation" or
> something like that.

The idea is to (at grace-period start) take a snapshot of the CPU's
value of kstat.softirqs[RCU_SOFTIRQ], then check it at stall time, right?

Or do I have the wrong softirq counter?
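
To make the proposal concrete, here is a minimal user-space sketch of the
snapshot-and-compare idea. It is only a model, not kernel code: every
sim_* name and SIM_NR_CPUS are hypothetical, a plain array stands in for
the per-CPU kstat.softirqs[RCU_SOFTIRQ] count, and the message text is
just the wording suggested above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* User-space model only: every name here is hypothetical, not a kernel API. */
#define SIM_NR_CPUS 4

/* Stand-in for each CPU's kstat.softirqs[RCU_SOFTIRQ] count. */
static unsigned int sim_rcu_softirq_count[SIM_NR_CPUS];

/* Per-CPU snapshot of that count, taken when a grace period starts. */
static unsigned int sim_gp_start_snap[SIM_NR_CPUS];

/* Model one run of the RCU softirq handler on the given CPU. */
void sim_rcu_softirq_ran(int cpu)
{
	assert(cpu >= 0 && cpu < SIM_NR_CPUS);
	sim_rcu_softirq_count[cpu]++;
}

/* Grace-period start: record every CPU's current count. */
void sim_gp_start(void)
{
	for (int cpu = 0; cpu < SIM_NR_CPUS; cpu++)
		sim_gp_start_snap[cpu] = sim_rcu_softirq_count[cpu];
}

/* Stall time: true if the count has not moved since the snapshot. */
bool sim_softirq_starved(int cpu)
{
	assert(cpu >= 0 && cpu < SIM_NR_CPUS);
	return sim_rcu_softirq_count[cpu] == sim_gp_start_snap[cpu];
}

/* Print a stall report, adding the starvation hint when it applies. */
void sim_report_stall(int cpu)
{
	printf("stall detected on CPU %d%s\n", cpu,
	       sim_softirq_starved(cpu) ?
	       " (possibly caused by softirq starvation)" : "");
}
```

An unchanged count only says that the RCU softirq handler did not run on
that CPU during the grace period, so the hint is best treated as a
probable cause rather than a diagnosis.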

Thanx, Paul
