Re: Mysterious CFQ crash and RCU

From: Paul E. McKenney
Date: Sat Jun 04 2011 - 12:03:47 EST


On Sat, Jun 04, 2011 at 02:50:17PM +0200, Paul Bolle wrote:
> On Thu, 2011-06-02 at 22:07 -0700, Paul E. McKenney wrote:
> > And please accept my apologies for being so slow to get to it.
>
> Thanks, but it was just a week (i.e., quite a quick response by my
> standards).
>
> > Looks healthy to me...
>
> How should I understand that? Something like: "As far as this hlist is
> used with RCU everything seems OK. Perhaps something is messing with the
> entries of this hlist outside of RCU. Perhaps additional locking is
> needed."

More like "based on these diagnostics, I see no evidence of the RCU
implementation misbehaving." Which is of course different than "I can
prove that the RCU implementation is not misbehaving". That said, the
fact that you are running on a single CPU makes it hard for me to see
any latitude for RCU-implementation misbehavior.

Clearly something is wrong somewhere. Given the fact that on a single-CPU
system, synchronize_rcu() is a no-op, and given that you weren't able
to reproduce with CONFIG_TREE_PREEMPT_RCU=y, my guess is that there is
a synchronize_rcu() that occasionally (illegally) gets executed within
an RCU read-side critical section.
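
To make that guess concrete, here is a minimal userspace sketch of the
suspected pattern. It is a toy model and not kernel code: every toy_*
name and struct item are hypothetical, and the "freed" flag stands in
for an actual free so that the effect can be observed safely. It assumes
only what is stated above, namely that the grace-period wait returns
immediately on a single CPU.

```c
#include <assert.h>

/* Toy userspace model, NOT kernel code. All toy_* names are hypothetical. */

struct item {
	int value;
	int freed;	/* set instead of calling free(), so the bug is observable */
};

static int toy_read_depth;	/* nesting depth of read-side critical sections */
static int toy_illegal_sync;	/* grace-period waits issued from inside a reader */

static void toy_rcu_read_lock(void)
{
	toy_read_depth++;
}

static void toy_rcu_read_unlock(void)
{
	assert(toy_read_depth > 0);	/* catch unbalanced unlocks */
	toy_read_depth--;
}

/*
 * Models a grace-period wait that degenerates to a no-op on one CPU:
 * it returns immediately. The only addition is a debug counter that
 * records calls made while a reader is still active.
 */
static void toy_synchronize_rcu(void)
{
	if (toy_read_depth > 0)
		toy_illegal_sync++;
}

/* Updater: wait for readers (a no-op here), then "free" the item. */
static void toy_remove_and_free(struct item *it)
{
	toy_synchronize_rcu();
	it->freed = 1;
}

/*
 * Buggy reader: calls the updater from inside its own read-side
 * critical section, then keeps using the item.
 * Returns 1 if the reader saw the item after it was "freed".
 */
static int toy_buggy_reader(struct item *it)
{
	int saw_freed;

	toy_rcu_read_lock();
	toy_remove_and_free(it);	/* illegal inside a read-side section */
	saw_freed = it->freed;		/* would be a use-after-free */
	toy_rcu_read_unlock();
	return saw_freed;
}
```

Calling toy_buggy_reader() on a fresh item returns 1 and leaves
toy_illegal_sync at 1: the wait returned at once, so the reader went
on to touch an item that had already been released. A counter like
toy_illegal_sync is just one way such a call could be flagged in a
model like this.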

Thanx, Paul