Re: [PATCH diagnostic] Re: HPET regression in 2.6.26 versus 2.6.25-- RCU problem

From: Ingo Molnar
Date: Mon Aug 11 2008 - 07:39:01 EST



* Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx> wrote:

> And here is the patch. It is still a bit raw, so the results should
> be viewed with some suspicion. It adds a default-off kernel parameter
> CONFIG_RCU_CPU_STALL which must be enabled.
>
> Rather than exponential backoff, it backs off to once per 30 seconds.
> My feeling upon thinking on it was that if you have stalled RCU grace
> periods for that long, a few extra printk() messages are probably the
> least of your worries...

While this won't debug problems where timer irqs are genuinely stuck for
long periods of time, it should find problems with the RCU completion
logic itself in the presence of correct timer irqs. The lack of any
messages from this debug option should then point the finger more firmly
in the direction of stalled timer irqs.
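
For reference, the fixed-interval backoff described in the quoted text
could look roughly like the user-space sketch below. It is not taken from
the patch: the struct, the function names and the 3-second initial delay
are all hypothetical, and only the 30-second repeat interval comes from
the description above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constants, chosen for illustration only. */
#define STALL_FIRST_CHECK_SECS   3  /* assumed delay before the first report */
#define STALL_RECHECK_SECS      30  /* fixed repeat interval, per the description */

/* Hypothetical per-grace-period bookkeeping; times are in seconds. */
struct stall_state {
	uint64_t gp_start;     /* when the current grace period began */
	uint64_t next_report;  /* earliest time the next warning may be printed */
};

/* Call when a new grace period starts. */
void stall_gp_begin(struct stall_state *s, uint64_t now)
{
	s->gp_start = now;
	s->next_report = now + STALL_FIRST_CHECK_SECS;
}

/*
 * Call periodically while the grace period is still pending.
 * Returns true when a stall warning should be printed at time 'now'.
 * After each report the next one is pushed out by a fixed 30 seconds,
 * rather than by a doubling (exponential) interval.
 */
bool stall_should_report(struct stall_state *s, uint64_t now)
{
	if (now < s->next_report)
		return false;
	s->next_report = now + STALL_RECHECK_SECS;
	return true;
}
```

With a fixed interval, a grace period stalled for ten minutes produces on
the order of twenty messages, which matches the "few extra printk()
messages" trade-off in the quoted description.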

So I find this debug feature rather useful and have applied it to
tip/core/rcu (and cleaned it up a bit). I renamed the config option to
CONFIG_DEBUG_RCU_STALL to bring it in line with the usual debug option
names. Let's see whether -tip testing finds any false positives.

Ingo