Re: WARNING: at kernel/rcutorture.c:1243 rcu_torture_printk

From: Paul E. McKenney
Date: Tue Jun 11 2013 - 19:03:35 EST


On Tue, Jun 11, 2013 at 10:14:55AM +0800, Fengguang Wu wrote:
> Paul,
>
> On Mon, Jun 10, 2013 at 07:51:10AM -0700, Paul E. McKenney wrote:
> > On Mon, Jun 10, 2013 at 03:47:28PM +0800, Fengguang Wu wrote:
> > > Greetings,
> > >
> > > I got the below dmesg and the first bad commit is
> > >
> > > commit 911af505ef407c2511106c224dd640f882f0f590
> > > Author: Paul E. McKenney <paul.mckenney@xxxxxxxxxx>
> > > Date: Mon Feb 11 10:23:27 2013 -0800
> > >
> > > rcu: Provide compile-time control for no-CBs CPUs
> > >
> > > Currently, the only way to specify no-CBs CPUs is via the rcu_nocbs
> > > kernel command-line parameter. This is inconvenient in some cases,
> > > particularly for randconfig testing, so this commit adds a new set of
> > > kernel configuration parameters. CONFIG_RCU_NOCB_CPU_NONE (the default)
> > > retains the old behavior, CONFIG_RCU_NOCB_CPU_ZERO offloads callback
> > > processing from CPU 0 (along with any other CPUs specified by the
> > > rcu_nocbs boot-time parameter), and CONFIG_RCU_NOCB_CPU_ALL offloads
> > > callback processing from all CPUs.
> > >
> > > Signed-off-by: Paul E. McKenney <paul.mckenney@xxxxxxxxxx>
> > > Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
> > >
> > > However I guess it's a wrong bisect, the commit should be unmasking an
> > > old bug because it merely provides more kconfig options.
> > >
> > > This warning happened only once in 74 boots of the kernel.
> > >
> > > I also attached the 2nd dmesg which shows boot hang for the same
> > > kernel.
> > >
> > > Note that there are also 2 kernel hangs in the 74 boots. It should
> > > not be an RCU problem, however for completeness, I also attach a dmesg
> > > for your reference when considering this RCU warning.
> >
> > Either way, that is a nasty warning! You get this on recent kernels as
> > well, I take it?
>
> Yes, the warning is still in recent kernels, indicated from the last
> lines of the bisect log.
>
> 911af505ef407c2511106c224dd640f882f0f590 is bad and its parent
> 34ed62461ae4970695974afb9a60ac3df0086830 is good:
> > > git bisect bad 911af505ef407c2511106c224dd640f882f0f590 # 65 2013-06-07 16:14:41 rcu: Provide compile-time control for no-CBs CPUs
> > > git bisect good 34ed62461ae4970695974afb9a60ac3df0086830 # 900 2013-06-08 04:57:34 rcu: Remove restrictions on no-CBs CPUs
>
> After bisect, the script tries to confirm that
> 34ed62461ae4970695974afb9a60ac3df0086830 is really good by boot
> testing it 2700 more times:
>
> > > git bisect good 34ed62461ae4970695974afb9a60ac3df0086830 # 2700 2013-06-10 14:33:56 rcu: Remove restrictions on no-CBs CPUs
>
> And continue to find out whether the linus/master and
> linux-next/master are good/bad (here the results are both bad):
>
> > > git bisect bad f43e7a34255a387930bc2bb826ec17c052ce975e # 14:34 0 Merge branch 'for-next'
> > > git bisect bad c04efed734409f5a44715b54a6ca1b54b0ccf215 # 14:41 11 Add linux-next specific files for 20130607

Just to let you know that I am putting some time into this...

Well, I got a panic, but it turned out to be the kernel being unable to
locate init. Why your .config should have this effect, I have no idea --
I don't see any Kconfig parameters that look to me like they should be
specifying where init lives. Any hints?

Left to myself, my next step will be to try just the RCU-related Kconfig
parameters.

Thanx, Paul

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/