Re: linux-next: Tree for March 25 (Call trace: RCU|workqueues|block|VFS|ext4 related?)
From: Sedat Dilek
Date: Sun Mar 27 2011 - 08:26:23 EST
On Sun, Mar 27, 2011 at 7:07 AM, Paul E. McKenney
<paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> On Sat, Mar 26, 2011 at 08:25:29PM -0700, Paul E. McKenney wrote:
>> On Sun, Mar 27, 2011 at 03:30:34AM +0200, Sedat Dilek wrote:
>> > On Sun, Mar 27, 2011 at 1:09 AM, Paul E. McKenney
>> > <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
>> > > On Sat, Mar 26, 2011 at 11:15:22PM +0100, Sedat Dilek wrote:
>
> [ . . . ]
>
>> > >> But then came RCU :-(.
>> > >
>> > > Well, if it turns out to be a problem in RCU I will certainly apologize.
>> > >
>> >
>> > No, that's not so dramatic.
>> > Dealing with this RCU issue has had a nice side effect: I remembered
>> > (and finally got around) to using a reduced kernel-config set.
>> > I created the base for it with 'make localmodconfig' and did some
>> > manual fine-tuning afterwards (throwing out media, rc, dvd, unneeded
>> > FSs, etc.).
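>> > From memory, the workflow was something like this (the lsmod
>> > snapshot path is just an example, and the manual pruning was done
>> > in menuconfig):
>> >
>> > $ lsmod > /tmp/lsmod.txt
>> > $ make LSMOD=/tmp/lsmod.txt localmodconfig
>> > $ make menuconfig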
>> > Also, I can use fresh gcc-4.6 (4.6.0-1) from the official Debian repos.
>> >
>> > So, I started building with
>> > "revert-rcu-patches/0001-Revert-rcu-introduce-kfree_rcu.patch".
>> > I will let you know.
>>
>> And please also check for tasks consuming all available CPU.
>
> And I still cannot reproduce with the full RCU stack (but based off of
> 2.6.38 rather than -next). Nevertheless, if you would like to try a
> speculative patch, here you go.
>
You are right, and my strategy for handling this (possibly RCU-related?)
issue was wrong.
Surely you tested your RCU changes in your own repo, and everything may
well be fine on top of stable 2.6.38.
Linux-next gets daily updates from a lot of different trees, so there
may be interference with other changes.
Please understand that I am interested in finding out the cause of my
issues; my aim is not to blame you.
Since I was wrong before and want to be 99.9% sure it is the RCU
changes, I reverted all 18 RCU patches from linux-next (next-20110325)
while keeping the RCU|PREEMPT|HZ settings from the last working tree,
next-20110323 (revert sketched below; the kept settings follow it).
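For reference, the revert was along these lines (the commit range is a
placeholder; in practice I applied my prepared revert-rcu-patches/
series):

$ cd linux-next
$ git log --oneline next-20110323..next-20110325 -- kernel/rcutree.c \
      kernel/rcutree.h kernel/rcutree_plugin.h
$ git revert --no-edit <first-rcu-commit>^..<last-rcu-commit>

The kept settings: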
$ egrep 'RCU|PREEMPT|_HZ' /boot/config-2.6.38-next20110325-7-686-iniza
# RCU Subsystem
CONFIG_TREE_RCU=y
# CONFIG_PREEMPT_RCU is not set
# CONFIG_RCU_TRACE is not set
CONFIG_RCU_FANOUT=32
# CONFIG_RCU_FANOUT_EXACT is not set
CONFIG_RCU_FAST_NO_HZ=y
# CONFIG_TREE_RCU_TRACE is not set
CONFIG_PREEMPT_NOTIFIERS=y
CONFIG_NO_HZ=y
# CONFIG_PREEMPT_NONE is not set
CONFIG_PREEMPT_VOLUNTARY=y
# CONFIG_PREEMPT is not set
# CONFIG_HZ_100 is not set
CONFIG_HZ_250=y
# CONFIG_HZ_300 is not set
# CONFIG_HZ_1000 is not set
CONFIG_HZ=250
# CONFIG_SPARSE_RCU_POINTER is not set
# CONFIG_RCU_TORTURE_TEST is not set
# CONFIG_RCU_CPU_STALL_DETECTOR is not set
I will work with and stress this kernel before doing any step-by-step
revert of the RCU changes; per your request, I will also watch for
tasks consuming all available CPU.
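A minimal way to spot such CPU hogs with standard tools:

$ ps -eo pid,ni,%cpu,comm --sort=-%cpu | head -15
$ top -b -n 1 | head -20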
Thanks for your patch; I applied it on top of a "naked" next-20110325,
but I still see call traces.
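In case it helps, a quick sanity check that the renicing from the patch
takes effect (assuming the RCU kthreads show up with "rcu" in their
names):

$ ps -eLo pid,cls,rtprio,ni,comm | grep -i rcu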
- Sedat -
>                                                         Thanx, Paul
>
> ------------------------------------------------------------------------
>
> rcu: further lower priority in rcu_yield()
>
> Although rcu_yield() dropped from real-time to normal priority, there
> is always the possibility that the competing tasks have been niced.
> So nice to 19 in rcu_yield() to help ensure that other tasks have a
> better chance of running.
>
>     Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
>
> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> index 759f54b..5477764 100644
> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c
> @@ -1492,6 +1492,7 @@ static void rcu_yield(void (*f)(unsigned long), unsigned long arg)
>  	mod_timer(&yield_timer, jiffies + 2);
>  	sp.sched_priority = 0;
>  	sched_setscheduler_nocheck(current, SCHED_NORMAL, &sp);
> +	set_user_nice(current, 19);
>  	schedule();
>  	sp.sched_priority = RCU_KTHREAD_PRIO;
>  	sched_setscheduler_nocheck(current, SCHED_FIFO, &sp);
>