Re: [GIT PULL rcu/next] rcu commits for 2.6.40

From: Ingo Molnar
Date: Mon May 16 2011 - 03:39:46 EST

* Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx> wrote:

> > And the way you can prove that it is my code rather than the arch
> > code is to show that the warning happens on your system when the
> > irq_enter()/irq_exit() calls are perfectly nested.
>
> So I took another look at the RCU debugfs stats you provided earlier,
> and realized that your system gets a lot more NMIs than do the ones
> that I have access to. So as a diagnostic patch, I ifdefed out the
> body of rcu_nmi_enter() and rcu_nmi_exit().

Well, but the delays are occurring all the time (and it's bisectable) and NMIs
are generally not deterministic.

I'd really suggest the creation of a revert + fine-grained series on top of
core/rcu which would IMHO help us narrow this down a lot more directly than
jumping between the 'need_resched bug', 'nmi bug' and 'barrier bug' hypotheses.

( Btw., the bug still has the feeling of a need_resched/scheduling/timing
artifact to me, not barriers or NMI. )


