Re: [Patch] restore the RCU callback to defer put_task_struct() Re: Problems with 2.6.17-rt8

From: Robert Crocombe
Date: Mon Aug 14 2006 - 13:54:17 EST


I assume that replies were trimmed on accident, so I have resplattered everyone.

On 8/11/06, hui Bill Huey <billh@xxxxxxxxxxxxxxxxx> wrote:
Can you try it without RAID ? do you have a RAID set up ? You have tons
of options turned on in your .config, if you can isolate the problematic
system that, would be helpful to me since I don't have a lot of hardware.

Boo, I thought my config was pretty concise :(. Okay, I really
chopped it this time.

And yeah, it is a RAID config. But for extra bonus points, I found a
spare SCSI disk and installed Fedora Core 5 and did a 'yum upgrade' to
whatever was current as of today. So it's a single disk config now.
Problem still occurs with 't2' patched kernel.

config and dmesg attached, kaboom-like stuff appended.

--
Robert Crocombe
rcrocomb@xxxxxxxxx

BUG: scheduling while atomic: cc1/0x00000001/6787

Call Trace:
<ffffffff8024ff01>{stop_trace+9}
<ffffffff803f007a>{__schedule+165}
<ffffffff80209c99>{mcount+45}
<ffffffff80209c99>{mcount+45}
<ffffffff80249758>{atomic_dec_and_spin_lock+26}
<ffffffff803f0e2e>{schedule+246}
<ffffffff803f1dac>{rt_lock_slowlock+356}
<ffffffff803f26fa>{rt_lock+18}
<ffffffff80249758>{atomic_dec_and_spin_lock+26}
<ffffffff80237396>{free_uid+90}
<ffffffff8022b7dc>{__put_task_struct+147}
<ffffffff803f0b62>{thread_return+283}
<ffffffff80209c99>{mcount+45}
<ffffffff803f1ce6>{rt_lock_slowlock+158}
<ffffffff803f2780>{rt_read_lock+76}
<ffffffff803f0c80>{preempt_schedule+90}
<ffffffff803f2e1a>{_raw_spin_unlock+60}
<ffffffff803f1ce6>{rt_lock_slowlock+158}
<ffffffff803f2778>{rt_read_lock+68}
<ffffffff803f26fa>{rt_lock+18}
<ffffffff803f2780>{rt_read_lock+76}
<ffffffff80256f06>{find_get_page+40}
<ffffffff8025778b>{do_generic_mapping_read+371}
<ffffffff802596cb>{file_read_actor+0}
<ffffffff802582c0>{__generic_file_aio_read+344}
<ffffffff8025834e>{generic_file_aio_read+61}
<ffffffff8027c1d0>{do_sync_read+245}
<ffffffff80240c5f>{autoremove_wake_function+0}
<ffffffff80209c99>{mcount+45}
<ffffffff8027cb36>{vfs_read+176}
<ffffffff8027cf6c>{sys_read+76}
<ffffffff80209e27>{system_call+239}
---------------------------
| preempt count: 00000001 ]
| 1-level deep critical section nesting:
----------------------------------------
.. [<ffffffff803f0092>] .... __schedule+0xbd/0xa72
.....[<ffffffff803f0c80>] .. ( <= preempt_schedule+0x5a/0x90)

kjournald/1061[CPU#3]: BUG in debug_rt_mutex_unlock at
kernel/rtmutex-debug.c:471

Call Trace:
<ffffffff803f29a3>{_raw_spin_lock_irqsave+34}
<ffffffff8022db19>{__WARN_ON+105}
<ffffffff8022dad4>{__WARN_ON+36}
<ffffffff802494a7>{debug_rt_mutex_unlock+204}
<ffffffff803f1b89>{rt_lock_slowunlock+30}
<ffffffff803f26e6>{__lock_text_start+14}
<ffffffff80278660>{kmem_cache_alloc+207}
<ffffffff80259cc8>{mempool_alloc_slab+22}
<ffffffff8025a0b7>{mempool_alloc+80}
<ffffffff80209c99>{mcount+45}
<ffffffff802817e2>{bio_alloc_bioset+40}
<ffffffff80281902>{bio_alloc+21}
<ffffffff8027ded8>{submit_bh+142}
<ffffffff8027f183>{ll_rw_block+166}
<ffffffff802cefae>{journal_commit_transaction+995}
<ffffffff803f2e11>{_raw_spin_unlock+51}
<ffffffff803f1bb1>{rt_lock_slowunlock+70}
<ffffffff803f0b2d>{thread_return+230}
<ffffffff803f26e6>{__lock_text_start+14}
<ffffffff802365a1>{try_to_del_timer_sync+90}
<ffffffff803f0b2d>{thread_return+230}
<ffffffff802d351d>{kjournald+204}
<ffffffff80240c5f>{autoremove_wake_function+0}
<ffffffff802d3451>{kjournald+0}
<ffffffff80240831>{keventd_create_kthread+0}
<ffffffff80240b1f>{kthread+224}
<ffffffff80228e70>{schedule_tail+190}
<ffffffff8020b03e>{child_rip+8}
<ffffffff80240831>{keventd_create_kthread+0}
<ffffffff80240a3f>{kthread+0}
<ffffffff8020b036>{child_rip+0}
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<ffffffff803f28bf>] .... _raw_spin_lock+0x1b/0x28
.....[<ffffffff803f1b81>] .. ( <= rt_lock_slowunlock+0x16/0x70)
.. [<ffffffff803f29a3>] .... _raw_spin_lock_irqsave+0x22/0x33
.....[<ffffffff8022dad4>] .. ( <= __WARN_ON+0x24/0x88)

kjournald/1061[CPU#3]: BUG in debug_rt_mutex_unlock at
kernel/rtmutex-debug.c:472

Call Trace:
<ffffffff803f29a3>{_raw_spin_lock_irqsave+34}
<ffffffff8022db19>{__WARN_ON+105}
<ffffffff8022dad4>{__WARN_ON+36}
<ffffffff80249549>{debug_rt_mutex_unlock+366}
<ffffffff803f1b89>{rt_lock_slowunlock+30}
<ffffffff803f26e6>{__lock_text_start+14}
<ffffffff80278660>{kmem_cache_alloc+207}
<ffffffff80259cc8>{mempool_alloc_slab+22}
<ffffffff8025a0b7>{mempool_alloc+80}
<ffffffff80209c99>{mcount+45}
<ffffffff802817e2>{bio_alloc_bioset+40}
<ffffffff80281902>{bio_alloc+21}
<ffffffff8027ded8>{submit_bh+142}
<ffffffff8027f183>{ll_rw_block+166}
<ffffffff802cefae>{journal_commit_transaction+995}
<ffffffff803f2e11>{_raw_spin_unlock+51}
<ffffffff803f1bb1>{rt_lock_slowunlock+70}
<ffffffff803f0b2d>{thread_return+230}
<ffffffff803f26e6>{__lock_text_start+14}
<ffffffff802365a1>{try_to_del_timer_sync+90}
<ffffffff803f0b2d>{thread_return+230}
<ffffffff802d351d>{kjournald+204}
<ffffffff80240c5f>{autoremove_wake_function+0}
<ffffffff802d3451>{kjournald+0}
<ffffffff80240831>{keventd_create_kthread+0}
<ffffffff80240b1f>{kthread+224}
<ffffffff80228e70>{schedule_tail+190}
<ffffffff8020b03e>{child_rip+8}
<ffffffff80240831>{keventd_create_kthread+0}
<ffffffff80240a3f>{kthread+0}
<ffffffff8020b036>{child_rip+0}
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<ffffffff803f28bf>] .... _raw_spin_lock+0x1b/0x28
.....[<ffffffff803f1b81>] .. ( <= rt_lock_slowunlock+0x16/0x70)
.. [<ffffffff803f29a3>] .... _raw_spin_lock_irqsave+0x22/0x33
.....[<ffffffff8022dad4>] .. ( <= __WARN_ON+0x24/0x88)

----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at kernel/rtmutex.c:639
invalid opcode: 0000 [1] PREEMPT SMP
CPU 3
Modules linked in: tg3
Pid: 1061, comm: kjournald Not tainted 2.6.17-rt8_t2_00 #1
RIP: 0010:[<ffffffff803f1d02>] <ffffffff803f1d02>{rt_lock_slowlock+186}
RSP: 0018:ffff81040022bb98 EFLAGS: 00010246
RAX: ffff8102001c6000 RBX: 0000000000000010 RCX: 0000000000240180
RDX: ffff8102001c6000 RSI: ffffffff802785e5 RDI: ffff810600115b00
RBP: ffff81040022bc58 R08: ffff8104000d7dd8 R09: ffff81040022bb98
R10: ffff8104000d7dd8 R11: 0000000000000023 R12: ffff810600115b00
R13: ffff8101eafabb80 R14: ffffffff802785e5 R15: 0000000000000010
FS: 00002b8c5ddbd6f0(0000) GS:ffff8106001eb340(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00002b8c5dc07bf0 CR3: 0000000000201000 CR4: 00000000000006e0
Process kjournald (pid: 1061, threadinfo ffff81040022a000, task
ffff8102001c6000)
Stack: 111111110000008c ffff81040022bba0 ffff81040022bba0 ffff81040022bbb0
ffff81040022bbb0 0000000000000000 111111110000008c ffff81040022bbd0
ffff81040022bbd0 ffff81040022bbe0
Call Trace:
<ffffffff803f26fa>{rt_lock+18}
<ffffffff802785e5>{kmem_cache_alloc+84}
<ffffffff80259cc8>{mempool_alloc_slab+22}
<ffffffff8025a0b7>{mempool_alloc+80}
<ffffffff80281851>{bio_alloc_bioset+151}
<ffffffff80281902>{bio_alloc+21}
<ffffffff8027ded8>{submit_bh+142}
<ffffffff8027f183>{ll_rw_block+166}
<ffffffff802cefae>{journal_commit_transaction+995}
<ffffffff803f2e11>{_raw_spin_unlock+51}
<ffffffff803f1bb1>{rt_lock_slowunlock+70}
<ffffffff803f0b2d>{thread_return+230}
<ffffffff803f26e6>{__lock_text_start+14}
<ffffffff802365a1>{try_to_del_timer_sync+90}
<ffffffff803f0b2d>{thread_return+230}
<ffffffff802d351d>{kjournald+204}
<ffffffff80240c5f>{autoremove_wake_function+0}
<ffffffff802d3451>{kjournald+0}
<ffffffff80240831>{keventd_create_kthread+0}
<ffffffff80240b1f>{kthread+224}
<ffffffff80228e70>{schedule_tail+190}
<ffffffff8020b03e>{child_rip+8}
<ffffffff80240831>{keventd_create_kthread+0}
<ffffffff80240a3f>{kthread+0}
<ffffffff8020b036>{child_rip+0}
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<ffffffff803f28bf>] .... _raw_spin_lock+0x1b/0x28
.....[<ffffffff803f1c83>] .. ( <= rt_lock_slowlock+0x3b/0x213)
.. [<ffffffff803f2bd9>] .... _raw_spin_trylock+0x1b/0x5f
.....[<ffffffff8020b484>] .. ( <= oops_begin+0x28/0x77)


Code: 0f 0b 68 1b 3d 41 80 c2 7f 02 65 48 8b 04 25 00 00 00 00 41
RIP <ffffffff803f1d02>{rt_lock_slowlock+186} RSP <ffff81040022bb98>
<6>note: kjournald[1061] exited with preempt_count 1

Attachment: 2.6.17-rt8-t2_00
Description: Binary data

Attachment: dmesg_2.6.17-rt8-t2_00
Description: Binary data