Re: NMI Watchdog detected LOCKUP on CPU1 (stext_lock)(2.4.0-test9-pre2)

From: Keith Owens (kaos@ocs.com.au)
Date: Tue Sep 19 2000 - 20:39:03 EST


On Tue, 19 Sep 2000 19:53:19 +0200,
Jorge Nerin <jnerin@svalero.es> wrote:
>All the traces end up in stext_lock, so I think it' the same bug
>>>EIP; c01df3aa <stext_lock+32ba/7f30> <=====
>Trace; c015db32 <generic_make_request+ce/120>
>Trace; c015dd03 <ll_rw_block+17f/1f4>
>Trace; c0136149 <flush_dirty_buffers+91/d8>
>Trace; c01363fd <bdflush+8d/150>
>Trace; c01079bb <kernel_thread+23/30>
>Code; c01df3aa <stext_lock+32ba/7f30>
>00000000 <_EIP>:
>Code; c01df3aa <stext_lock+32ba/7f30> <=====
> 0: f3 90 repz nop <=====
>Code; c01df3ac <stext_lock+32bc/7f30>
> 2: 7e f5 jle fffffff9 <_EIP+0xfffffff9>
>c01df3a3 <stext_lock+32b3/7f30>
>Code; c01df3ae <stext_lock+32be/7f30>
> 4: e9 a6 da f7 ff jmp fff7daaf <_EIP+0xfff7daaf>
>c015ce59 <blk_get_queue+9/60>

Just because the traces end up in stext_lock does not mean that they
are the same bug. Locks are optimized for pipeline performance, the
code for "got the lock" is in the main text section, the code for
"cannot get lock, need to wait" is moved to a separate text section.
That way only the failure case gets pipeline stalls.

The downside of this optimization is that all code that is waiting for
a lock appears to be in the out of line section and the only label in
that section is right at the start. So all lock code appears to be in
stext_lock. What really matters is where the stext_lock code jumps
back to, that tells you which code is waiting for the lock. In this
case you have jumps back to blk_get_queue+9/60 so it is waiting on
io_request_lock. Now all you have to do is work out who is holding
onto io_request_lock.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sat Sep 23 2000 - 21:00:22 EST