Re: frequent lockups in 3.18rc4
From: Paul E. McKenney
Date: Tue Dec 02 2014 - 12:04:52 EST
On Tue, Dec 02, 2014 at 02:43:17PM -0200, Dâniel Fraga wrote:
> On Mon, 1 Dec 2014 15:08:13 -0800
> "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
>
> > Well, this turned out to be way simpler than I expected. Passes
> > light rcutorture testing. Sometimes you get lucky...
>
> Linus, Paul and others, I finally got a call trace with
> only CONFIG_TREE_PREEMPT_RCU *disabled* using Paul's patch (to trigger
> it I compiled PHP with make -j8).

Is it harder to reproduce with CONFIG_PREEMPT=y and CONFIG_TREE_PREEMPT_RCU=n?

If it is a -lot- harder to reproduce, it might be worth bisecting among
the RCU read-side critical sections. If making a few of them be
non-preemptible greatly reduces the probability of the bug occurring,
that might provide a clue about root cause.

On the other hand, if it is just a little harder to reproduce, this
RCU read-side bisection would likely be an exercise in futility.
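
For illustration only, the read-side bisection described above could be organized along the following lines. This is a rough sketch, not a procedure taken from this thread: the site labels and the `reproduces` callback are hypothetical, and it assumes a single culprit section and a yes/no reproduction test. In practice the bug is probabilistic, so each step would need many runs, and making a section non-preemptible would mean something like wrapping it in preempt_disable()/preempt_enable() by hand.

```python
def bisect_sites(sites, reproduces):
    """Narrow a list of RCU read-side critical sections down to one suspect.

    sites      -- hypothetical labels for read-side critical sections
    reproduces -- hypothetical callback: given the set of sections made
                  non-preemptible, return True if the bug still shows up
                  readily with that kernel
    """
    candidates = list(sites)
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        if reproduces(set(half)):
            # Bug still fires: the suspect is among the sections that
            # were left preemptible.
            candidates = candidates[len(half):]
        else:
            # Bug became much rarer: the suspect is in the half that
            # was made non-preemptible.
            candidates = half
    return candidates[0] if candidates else None


if __name__ == "__main__":
    # Toy stand-in for "build, boot, run make -j8, watch for a stall".
    culprit = "site_d"
    labels = ["site_a", "site_b", "site_c", "site_d", "site_e"]
    print(bisect_sites(labels, lambda nonpreemptible: culprit not in nonpreemptible))
```

If the bug is only a little harder to reproduce with non-preemptible readers, the callback never gives a clean answer and the loop above tells you nothing, which is the futility case.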

Thanx, Paul

> Dec 2 14:24:39 tux kernel: [ 8475.941616] conftest[9730]: segfault at 0 ip 0000000000400640 sp 00007fffa67ab300 error 4 in conftest[400000+1000]
> Dec 2 14:24:40 tux kernel: [ 8476.104725] conftest[9753]: segfault at 0 ip 00007f6863024906 sp 00007fff0e31cc48 error 4 in libc-2.19.so[7f6862efe000+1a1000]
> Dec 2 14:25:54 tux kernel: [ 8550.791697] INFO: rcu_sched detected stalls on CPUs/tasks: { 4} (detected by 0, t=60002 jiffies, g=112854, c=112853, q=0)
> Dec 2 14:25:54 tux kernel: [ 8550.791702] Task dump for CPU 4:
> Dec 2 14:25:54 tux kernel: [ 8550.791703] cc1 R running task 0 14344 14340 0x00080008
> Dec 2 14:25:54 tux kernel: [ 8550.791706] 000000001bcebcd8 ffff880100000003 ffffffff810cb7f1 ffff88021f5f5c00
> Dec 2 14:25:54 tux kernel: [ 8550.791708] ffff88011bcebfd8 ffff88011bcebce8 ffffffff811fb970 ffff8802149a2a00
> Dec 2 14:25:54 tux kernel: [ 8550.791710] ffff8802149a2cc8 ffff88011bcebd28 ffffffff8103e979 ffff88020ed01398
> Dec 2 14:25:54 tux kernel: [ 8550.791712] Call Trace:
> Dec 2 14:25:54 tux kernel: [ 8550.791718] [<ffffffff810cb7f1>] ? release_pages+0xa1/0x1e0
> Dec 2 14:25:54 tux kernel: [ 8550.791722] [<ffffffff811fb970>] ? cpumask_any_but+0x30/0x40
> Dec 2 14:25:54 tux kernel: [ 8550.791725] [<ffffffff8103e979>] ? flush_tlb_page+0x49/0xf0
> Dec 2 14:25:54 tux kernel: [ 8550.791727] [<ffffffff810cbe72>] ? lru_cache_add_active_or_unevictable+0x22/0x90
> Dec 2 14:25:54 tux kernel: [ 8550.791731] [<ffffffff810fc4c2>] ? alloc_pages_vma+0x72/0x130
> Dec 2 14:25:54 tux kernel: [ 8550.791733] [<ffffffff810cbe72>] ? lru_cache_add_active_or_unevictable+0x22/0x90
> Dec 2 14:25:54 tux kernel: [ 8550.791735] [<ffffffff810e5220>] ? handle_mm_fault+0x3a0/0xaf0
> Dec 2 14:25:54 tux kernel: [ 8550.791737] [<ffffffff81039074>] ? __do_page_fault+0x224/0x4c0
> Dec 2 14:25:54 tux kernel: [ 8550.791740] [<ffffffff8110d54c>] ? new_sync_write+0x7c/0xb0
> Dec 2 14:25:55 tux kernel: [ 8550.791743] [<ffffffff8114765c>] ? fsnotify+0x27c/0x350
> Dec 2 14:25:55 tux kernel: [ 8550.791746] [<ffffffff81087233>] ? rcu_eqs_enter+0x93/0xa0
> Dec 2 14:25:55 tux kernel: [ 8550.791748] [<ffffffff81087a5e>] ? rcu_user_enter+0xe/0x10
> Dec 2 14:25:55 tux kernel: [ 8550.791749] [<ffffffff8103938a>] ? do_page_fault+0x5a/0x70
> Dec 2 14:25:55 tux kernel: [ 8550.791752] [<ffffffff8139d9d2>] ? page_fault+0x22/0x30
>
> If you need more info/testing, just ask.
>
> --
> Linux 3.17.0-dirty: Shuffling Zombie Juror
> http://www.youtube.com/DanielFragaBR
> http://exchangewar.info
> Bitcoin: 12H6661yoLDUZaYPdah6urZS5WiXwTAUgL
>
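
When comparing several such reports, it can help to pull the fields out of the stall line mechanically. The following is a small sketch that assumes the message layout shown in the log above; the function and key names are made up for illustration. As far as I know, t is the number of jiffies since the grace period started, g and c are the current and last-completed grace-period numbers, and q is the total number of queued callbacks.

```python
import re

# Pattern for the "detected stalls" line as it appears in the log above.
STALL_RE = re.compile(
    r"INFO: (?P<flavor>\w+) detected stalls on CPUs/tasks: "
    r"\{(?P<who>[^}]*)\} "
    r"\(detected by (?P<detector>\d+), t=(?P<t>\d+) jiffies, "
    r"g=(?P<g>-?\d+), c=(?P<c>-?\d+), q=(?P<q>\d+)\)"
)


def parse_stall(line):
    """Return a dict of stall-warning fields, or None if the line does not match."""
    m = STALL_RE.search(line)
    if m is None:
        return None
    tokens = m.group("who").split()
    return {
        "flavor": m.group("flavor"),
        # Purely numeric entries are CPU numbers; anything else is kept as-is.
        "cpus": [int(tok) for tok in tokens if tok.isdigit()],
        "other": [tok for tok in tokens if not tok.isdigit()],
        "detected_by": int(m.group("detector")),
        "jiffies": int(m.group("t")),
        "gp_current": int(m.group("g")),
        "gp_completed": int(m.group("c")),
        "callbacks_queued": int(m.group("q")),
    }


if __name__ == "__main__":
    sample = (
        "Dec 2 14:25:54 tux kernel: [ 8550.791697] INFO: rcu_sched detected "
        "stalls on CPUs/tasks: { 4} (detected by 0, t=60002 jiffies, "
        "g=112854, c=112853, q=0)"
    )
    print(parse_stall(sample))
```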