Re: linux-next: Tree for April 14 (Call-traces: RCU/ACPI/WQ related?)

From: Sedat Dilek
Date: Sun Apr 24 2011 - 05:36:52 EST


On Sun, Apr 24, 2011 at 8:27 AM, Paul E. McKenney
<paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> On Sun, Apr 24, 2011 at 01:08:42AM +0200, Sedat Dilek wrote:
>> On Sun, Apr 24, 2011 at 1:04 AM, Sedat Dilek <sedat.dilek@xxxxxxxxxxxxxx> wrote:
>> > On Sat, Apr 23, 2011 at 11:16 PM, Sedat Dilek
>> > <sedat.dilek@xxxxxxxxxxxxxx> wrote:
>> >> On Sat, Apr 23, 2011 at 11:05 PM, Paul E. McKenney
>> >> <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
>> >>> On Fri, Apr 22, 2011 at 07:36:58PM +0200, Sedat Dilek wrote:
>> >>>> On Fri, Apr 22, 2011 at 5:02 PM, Paul E. McKenney
>> >>>> <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
>> >>>> > On Fri, Apr 22, 2011 at 11:40:54AM +0200, Sedat Dilek wrote:
>> >>>> >> On Fri, Apr 22, 2011 at 2:50 AM, Paul E. McKenney
>> >>>> >> <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
>> >>>> >> > On Thu, Apr 21, 2011 at 04:47:31PM +0200, Sedat Dilek wrote:
>> >>>> >> >> On Thu, Apr 21, 2011 at 4:28 PM, Paul E. McKenney
>> >>>> >> >> <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
>> >>>> >> >> > On Thu, Apr 21, 2011 at 02:49:37PM +0200, Sedat Dilek wrote:
>> >>>> >> >> >> On Thu, Apr 21, 2011 at 12:24 PM, Sedat Dilek
>> >>>> >> >> >> <sedat.dilek@xxxxxxxxxxxxxx> wrote:
>> >>>> >> >
>> >>>> >> > [ . . . ]
>> >>>> >> >
>> >>>> >> >> >> Here the results from the 2nd-run (PREEMPT_RCU enabled).
>> >>>> >> >> >
>> >>>> >> >> > OK, and the grace periods clearly stopped advancing early on.
>> >>>> >> >> >
>> >>>> >> >> > Beyond that point, the per-CPU kthread is blocked, but RCU has some
>> >>>> >> >> > work for it to do. ÂSo someone has called invoke_rcu_cpu_kthread(),
>> >>>> >> >> > but rcu_cpu_kthread() is still blocked. ÂI don't see a bug right
>> >>>> >> >> > off-hand, but it is early in the morning for me, so I might easily
>> >>>> >> >> > be missing something.
>> >>>> >> >> >
>> >>>> >> >> > Hmmm...
>> >>>> >> >> >
>> >>>> >> >> > The synchronization between these two assumes that the per-CPU
>> >>>> >> >> > kthread is always bound to the respective CPU, so if was somehow
>> >>>> >> >> > being migrated off, that might explain these results.
>> >>>> >> >> >
>> >>>> >> >> > I will add some more diagnostics, test them locally, then push
>> >>>> >> >> > out an update. ÂSeem reasonable?
>> >>>> >> >> >
>> >>>> >> >> > And thank you again for the testing!!!
>> >>>> >> >>
>> >>>> >> >> Ping me when you have new stuff for testing.
>> >>>> >> >> Tomorrow (friday), here is public holiday and monday, too.
>> >>>> >> >> So a looong weekend.
>> >>>> >> >
>> >>>> >> > ;-)
>> >>>> >> >
>> >>>> >> > OK, I have a new sedat.2011.04.21a branch in the -rcu git tree:
>> >>>> >> >
>> >>>> >> > git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu.git
>> >>>> >> >
>> >>>> >> > This is against 2.6.39-rc3, as before. Â(Yes, I do need to rebase to
>> >>>> >> > 2.6.39-rc4, but didn't want to change any more than I had to.)
>> >>>> >> >
>> >>>> >> > I also have an updated script, which is attached. ÂThe output is similar
>> >>>> >> > to the earlier one, and it operated is pretty much the same way.
>> >>>> >> >
>> >>>> >> > Have a great weekend, and I look forward to seeing what shows up on
>> >>>> >> > this round. ÂI confess to still being quite puzzled!
>> >>>> >> >
>> >>>> >> > Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â ÂThanx, Paul
>> >>>> >> >
>> >>>> >>
>> >>>> >> Here are the results of the Sedat's vote (European song contest :-)).
>> >>>> >
>> >>>> > ;-)
>> >>>> >
>> >>>> > Very strange. ÂRCU has told the per-CPU kthread that it needs to get
>> >>>> > to work, but this kthread is still waiting from RCU's viewpoint.
>> >>>> > Yet the "ps" command believes that this kthread is in fact runnable
>> >>>> > at SCHED_FIFO priority 1.
>> >>>> >
>> >>>> > I can tell that this one will require some thought... ÂAnd more
>> >>>> > diagnostics...
>> >>>> >
>> >>>> > Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â ÂThanx, Paul
>> >>>> >
>> >>>>
>> >>>> "We are with you in spirit."
>> >>>>
>> >>>> ( Level XX from Hybris shooter-game on Amiga (1989) )
>> >>>
>> >>> OK, I added a few more diagnostics: sedat.2011.04.23a in -rcu:
>> >>>
>> >>> git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu.git
>> >>>
>> >>> When you get a chance, could you please give it a try?
>> >>>
>> >>> Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â ÂThanx, Paul
>> >>>
>> >>
>> >> As soon as I can clone/pull from new GIT repo/branch.
>> >> Currently, I don't see it only, but kernel-mirrors are sometimes slow.
>> >>
>> >> I will report later.
>> >>
>> >> - Sedat -
>> >>
>> >
>> > Hi Paul,
>> >
>> > I have seen CONFIG_DEBUG_OBJECTS_RCU_HEAD mentionned in the commits,
>> > so I enabled it here together with CONFIG_DEBUG_OBJECTS=y.
>> > Hope this is OK.
>> > Am I missing other useful (*DEBUG_OBJECT*) kernel options?
>> >
>> > - Sedat -
>> >
>> > P.S.: Enabled CONFIG_DEBUG_OBJECTS=y and
>> > CONFIG_DEBUG_OBJECTS_RCU_HEAD=y (and some more)
>> >
>> > # diff -uprN for-paulk-3/config-2.6.39-rc3-preempt-rcu-sedat.2011.04.21a+
>> > for-paulk-4/config-2.6.39-rc3-preempt-rcu-sedat.2011.04.23a+
>> > --- for-paulk-3/config-2.6.39-rc3-preempt-rcu-sedat.2011.04.21a+
>> > Â2011-04-22 10:25:42.000000000 +0200
>> > +++ for-paulk-4/config-2.6.39-rc3-preempt-rcu-sedat.2011.04.23a+
>> > Â2011-04-24 00:35:37.000000000 +0200
>> > @@ -1,7 +1,7 @@
>> > Â#
>> > Â# Automatically generated make config: don't edit
>> > Â# Linux/i386 2.6.39-rc3 Kernel Configuration
>> > -# Fri Apr 22 09:54:37 2011
>> > +# Sat Apr 23 23:58:52 2011
>> > Â#
>> > Â# CONFIG_64BIT is not set
>> > ÂCONFIG_X86_32=y
>> > @@ -3065,7 +3065,14 @@ CONFIG_BOOTPARAM_HUNG_TASK_PANIC_VALUE=0
>> > ÂCONFIG_SCHED_DEBUG=y
>> > Â# CONFIG_SCHEDSTATS is not set
>> > ÂCONFIG_TIMER_STATS=y
>> > -# CONFIG_DEBUG_OBJECTS is not set
>> > +CONFIG_DEBUG_OBJECTS=y
>> > +# CONFIG_DEBUG_OBJECTS_SELFTEST is not set
>> > +# CONFIG_DEBUG_OBJECTS_FREE is not set
>> > +CONFIG_DEBUG_OBJECTS_TIMERS=y
>> > +# CONFIG_DEBUG_OBJECTS_WORK is not set
>> > +CONFIG_DEBUG_OBJECTS_RCU_HEAD=y
>> > +CONFIG_DEBUG_OBJECTS_PERCPU_COUNTER=y
>> > +CONFIG_DEBUG_OBJECTS_ENABLE_DEFAULT=1
>> > Â# CONFIG_SLUB_DEBUG_ON is not set
>> > Â# CONFIG_SLUB_STATS is not set
>> > Â# CONFIG_DEBUG_KMEMLEAK is not set
>> >
>>
>> Oops, forget the dmesg output.
>
> OK, this looks unrelated, but just in case, could you please try it
> again with the following patch? Â(Not mainlinable, debug only.)
>
> Also, it does look like you are still seeing a grace-period hang.
> Could you please send the output of the script? ÂSame one as last time.
>
> Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â ÂThanx, Paul
>
> ------------------------------------------------------------------------
>
> Âdebugobjects.c | Â Â8 +++++---
> Â1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/lib/debugobjects.c b/lib/debugobjects.c
> index 9d86e45..10a7c7a 100644
> --- a/lib/debugobjects.c
> +++ b/lib/debugobjects.c
> @@ -289,10 +289,12 @@ static void debug_object_is_on_stack(void *addr, int onstack)
> Â Â Â Â Â Â Â Âreturn;
>
> Â Â Â Âlimit++;
> - Â Â Â if (is_on_stack)
> + Â Â Â if (is_on_stack) {
> + Â Â Â Â Â Â Â struct rcu_head *p = (struct rcu_head *)addr;
> Â Â Â Â Â Â Â Âprintk(KERN_WARNING
> - Â Â Â Â Â Â Â Â Â Â Â"ODEBUG: object is on stack, but not annotated\n");
> - Â Â Â else
> + Â Â Â Â Â Â Â Â Â Â Â"ODEBUG: object is on stack, but not annotated: %p\n",
> + Â Â Â Â Â Â Â Â Â Â Âp->func);
> + Â Â Â } else
> Â Â Â Â Â Â Â Âprintk(KERN_WARNING
> Â Â Â Â Â Â Â Â Â Â Â "ODEBUG: object is not on stack, but annotated\n");
> Â Â Â ÂWARN_ON(1);
>

Somehow your attached patch was not applicable.
As the changes were a few lines I applied it by myself.
Attached are log, dmesg and patches (orig + mine)

- Sedat -

Attachment: for-paulk-5.tar.xz
Description: Binary data

Attachment: for-paulk-5.tar.xz.sha256sum
Description: Binary data