Re: NMI Watchdog detected LOCKUP on CPU1 (stext_lock)(2.4.0-test9-pre2)

From: Jeff V. Merkey (jmerkey@timpanogas.com)
Date: Tue Sep 19 2000 - 20:44:30 EST


Keith,

I've seen some problems with the way Linus (or whoever) did this. I
had a bug I worked on for 5 weeks related to the buggy 2.7 gcc linker on
Caldera Linux 2.4 that would, for whatever reason, fail to fix up all the
.text.lock code sections in a file (probably because there were so many
of them). It does not seem to save any memory space doing it this
way, since I've noticed tons of these little segments all over the
place.

I've seen a bug related to glibc and gcc on Caldera 2.4 with this
locking model: I saw 50+ references in a code section, with only 33
of them getting fixed up. You might want to check whether this person has
mixed and matched gcc and glibc versions, and perhaps the linker is
barfing or something.

:-)

Jeff

Keith Owens wrote:
>
> On Tue, 19 Sep 2000 19:53:19 +0200,
> Jorge Nerin <jnerin@svalero.es> wrote:
> >All the traces end up in stext_lock, so I think it's the same bug
> >>>EIP; c01df3aa <stext_lock+32ba/7f30> <=====
> >Trace; c015db32 <generic_make_request+ce/120>
> >Trace; c015dd03 <ll_rw_block+17f/1f4>
> >Trace; c0136149 <flush_dirty_buffers+91/d8>
> >Trace; c01363fd <bdflush+8d/150>
> >Trace; c01079bb <kernel_thread+23/30>
> >Code; c01df3aa <stext_lock+32ba/7f30>
> >00000000 <_EIP>:
> >Code; c01df3aa <stext_lock+32ba/7f30> <=====
> > 0: f3 90 repz nop <=====
> >Code; c01df3ac <stext_lock+32bc/7f30>
> > 2: 7e f5 jle fffffff9 <_EIP+0xfffffff9>
> >c01df3a3 <stext_lock+32b3/7f30>
> >Code; c01df3ae <stext_lock+32be/7f30>
> > 4: e9 a6 da f7 ff jmp fff7daaf <_EIP+0xfff7daaf>
> >c015ce59 <blk_get_queue+9/60>
>
> Just because the traces end up in stext_lock does not mean that they
> are the same bug. Locks are optimized for pipeline performance: the
> code for "got the lock" is in the main text section, while the code for
> "cannot get lock, need to wait" is moved to a separate text section.
> That way only the failure case gets pipeline stalls.
>
> The downside of this optimization is that all code that is waiting for
> a lock appears to be in the out-of-line section, and the only label in
> that section is right at the start. So all lock code appears to be in
> stext_lock. What really matters is where the stext_lock code jumps
> back to; that tells you which code is waiting for the lock. In this
> case you have a jump back to blk_get_queue+9/60, so it is waiting on
> io_request_lock. Now all you have to do is work out who is holding
> onto io_request_lock.
>
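
For anyone who wants to check the jump target by hand, here is a rough
sketch of the arithmetic in Python. It is only an illustration, not how
ksymoops is implemented. The helper names (jmp_rel32_target, symbolize)
are made up for this note, and the three symbol start addresses are not
taken from a System.map; they are derived by subtracting the offsets
shown in the trace above (e.g. c01df3aa - 0x32ba for stext_lock).

```python
import bisect

MASK32 = 0xFFFFFFFF  # addresses in this oops are 32-bit (i386)


def jmp_rel32_target(insn_addr, insn_bytes):
    """Return the target of a 5-byte x86 `jmp rel32` (opcode 0xE9).

    The displacement is a signed little-endian 32-bit value that is
    added to the address of the *next* instruction.
    """
    if len(insn_bytes) != 5 or insn_bytes[0] != 0xE9:
        raise ValueError("not a jmp rel32 instruction")
    rel = int.from_bytes(insn_bytes[1:5], "little", signed=True)
    return (insn_addr + 5 + rel) & MASK32


# Hypothetical mini symbol table. Start addresses are assumptions
# derived from the oops: reported address minus reported offset.
SYMBOLS = sorted([
    (0xC015CE50, "blk_get_queue"),         # c015ce59 - 0x9
    (0xC015DA64, "generic_make_request"),  # c015db32 - 0xce
    (0xC01DC0F0, "stext_lock"),            # c01df3aa - 0x32ba
])
STARTS = [addr for addr, _ in SYMBOLS]


def symbolize(addr):
    """Map an address to the nearest preceding label, as symbol+offset."""
    i = bisect.bisect_right(STARTS, addr) - 1
    if i < 0:
        raise ValueError("address is below the first known symbol")
    start, name = SYMBOLS[i]
    return f"{name}+{addr - start:#x}"


# The jmp at c01df3ae from the Code; lines: e9 a6 da f7 ff
target = jmp_rel32_target(0xC01DF3AE, bytes.fromhex("e9a6daf7ff"))
print(hex(target), symbolize(target))   # where the waiter returns to
print(symbolize(0xC01DF3AA))            # the spinning EIP itself
```

The spinning EIP resolves to stext_lock because that is the only label
before it in the out-of-line section, while the decoded jump target
lands in blk_get_queue, which is the code actually waiting for the lock.
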
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sat Sep 23 2000 - 21:00:22 EST