More info (disassembled) on UP flu ("lock")

Romano Giannetti (romano@upco.es)
Thu, 19 Nov 1998 11:29:38 +0100


I did some more research on the "UP flu" in kernel 2.1.127. I hope
that one of you gurus can use this info to track down the
bug.

To summarize: I have a PII system, IDE-only, almost problem-free (Note 2),
no APM enabled, UP system and UP kernel, ISA 3c509 ethernet,
monolithic kernel. The "lock" affected only new processes, and they were
extremely slow rather than locked: in 20 minutes I managed to get a new
rxvt (but no prompt). The new processes were looping tightly in
schedule(), but other processes (xload, xclock, xwatch) were running
happily, although a bit slower (and yes, xwatch did hit the disk, and
it was running happily and displaying the contents of
/var/log/messages).

When my machine was locked, EIP was almost always at
<schedule+0x116>. To find out where that is, I recompiled sched.c
with the exact same flags as the kernel build (Note 1) but adding -g,
then disassembled it with gdb. Around that point I have:

0x546 <schedule+250>: sti
0x547 <schedule+251>: movl $0xfffffc18,%edi
0x54c <schedule+256>: movl $0x0,%ecx
0x551 <schedule+261>: cmpl $0x0,%ebx
0x557 <schedule+267>: je 0x5d1 <schedule+389>
0x559 <schedule+269>: leal 0x0(%esi),%esi
0x55c <schedule+272>: movl 0x90(%ebx),%edx
0x562 <schedule+278>: movl %edx,0xfffffffc(%ebp)
0x565 <schedule+281>: testb $0x10,%dl
0x568 <schedule+284>: je 0x57c <schedule+304>
0x56a <schedule+286>: andb $0xef,%dl
0x56d <schedule+289>: movl %edx,0x90(%ebx)
0x573 <schedule+295>: movl $0x0,0xfffffffc(%ebp)
0x57a <schedule+302>: jmp 0x5bc <schedule+368>
0x57c <schedule+304>: cmpl $0x0,0xfffffffc(%ebp)
0x580 <schedule+308>: je 0x594 <schedule+328>
0x582 <schedule+310>: movl 0x94(%ebx),%eax
0x588 <schedule+316>: addl $0x3e8,%eax
0x58d <schedule+321>: movl %eax,0xfffffffc(%ebp)
0x590 <schedule+324>: jmp 0x5bc <schedule+368>
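
Note that the EIP offset is given in hex (0x116) while gdb labels the
listing with decimal offsets. A minimal sketch of the arithmetic, assuming
the base address of schedule() in sched.o can be derived from the first
listed line (0x546 is <schedule+250>); the helper names are only
illustrative:

```c
/* Base address of schedule() inside sched.o, derived from the listing:
 * address 0x546 is labelled <schedule+250>, so base = 0x546 - 250. */
#define SCHEDULE_BASE (0x546UL - 250UL)

/* Address inside sched.o for a given byte offset into schedule(). */
unsigned long addr_for_offset(unsigned long offset)
{
    return SCHEDULE_BASE + offset;
}

/* Byte offset into schedule() for a given address inside sched.o. */
unsigned long offset_for_addr(unsigned long addr)
{
    return addr - SCHEDULE_BASE;
}
```

With these, addr_for_offset(0x116) gives 0x562, i.e. the line labelled
<schedule+278> in the listing (0x116 is 278 decimal).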

I could not manage to find the corresponding point in the C source, really... If
I do a list *0x562, gdb tells me:

0x562 is in sched_init (sched.c:631).
626 return;
627
628 scheduling_in_interrupt:
629 printk("Scheduling in interrupt\n");
630 *(int *)0 = 0;
631 }
632
633
634 rwlock_t waitqueue_lock = RW_LOCK_UNLOCKED;
635

??? Probably some optimization going on. It does not seem to
correspond to the assembler code.
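
For comparison, the instructions from <schedule+272> to <schedule+324> can
be transliterated by hand into C. This is only a sketch of one reading of
the listing: the structure and field names are hypothetical, and only the
offsets 0x90 and 0x94 and the constants 0x10 and 0x3e8 (1000) come from the
disassembly. The shape looks like a per-task weight computation, possibly
the goodness() logic inlined into schedule(), but that identification is a
guess.

```c
/* Hypothetical stand-in for the structure that %ebx points to. The field
 * names are guesses; only the offsets 0x90 and 0x94 are from the listing. */
struct task_guess {
    unsigned long word_at_0x90;  /* tested against bit 0x10 */
    unsigned long word_at_0x94;  /* added to 1000 (0x3e8) */
};

#define BIT_0x10 0x10UL

/* Mirrors <schedule+272> .. <schedule+324>: returns the value left in the
 * stack slot -4(%ebp) before the jump to <schedule+368>. Sets *fell_through
 * when control would instead continue at <schedule+328>, which is not part
 * of the listing. */
long weight_guess(struct task_guess *p, int *fell_through)
{
    unsigned long w = p->word_at_0x90;        /* movl 0x90(%ebx),%edx  */

    *fell_through = 0;
    if (w & BIT_0x10) {                       /* testb $0x10,%dl       */
        p->word_at_0x90 = w & ~BIT_0x10;      /* andb $0xef,%dl; store */
        return 0;                             /* movl $0x0,-4(%ebp)    */
    }
    if (w != 0)                               /* cmpl $0x0,-4(%ebp)    */
        return 1000 + (long)p->word_at_0x94;  /* addl $0x3e8,%eax      */

    *fell_through = 1;                        /* je <schedule+328>     */
    return 0;
}
```

If this reading is right, the code at 0x562 is in the middle of a loop over
the structures reached through %ebx, not in the error path that gdb lists.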

Note 1: I did:

gcc -D__KERNEL__ -I/usr/src/linux/include -Wall -Wstrict-prototypes -g \
-O2 -fomit-frame-pointer -pipe -fno-strength-reduce -m486 \
-malign-loops=2 -malign-jumps=2 -malign-functions=2 -DCPU=686 \
-fno-omit-frame-pointer -c -o sched.o sched.c

...why -fomit-frame-pointer -fno-omit-frame-pointer ??? Is it normal?
Is it just a local (for this file) override of the standard flags?

Note 2: "almost" means:

I cannot play audio CDs. After a while I get "hdc: lost interrupt",
and after that I have to reboot to access the CD-ROM again (or kill
the cdplay program blocked in D state). I am at the point where I think
it is flaky hw, although I think that the kernel should have a better
way to recover... I mean, a reset() on the interface after a lost
interrupt, for example, so that I can continue to work.
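
To make the suggestion concrete, here is a rough sketch of the
reset-and-retry idea against a simulated interface. Everything in it is
hypothetical (the struct, the function names, the retry limit); it is not
the IDE driver's code, only an illustration of recovering after a lost
interrupt instead of blocking forever.

```c
#include <stdbool.h>

/* Hypothetical device model: nothing here is real IDE driver code. */
struct fake_iface {
    int lost_left;  /* how many upcoming commands will lose their interrupt */
    int resets;     /* number of resets performed so far */
};

/* Pretend to issue a command and wait for its interrupt.
 * Returns false when the wait times out ("lost interrupt"). */
bool issue_and_wait(struct fake_iface *ifc)
{
    if (ifc->lost_left > 0) {
        ifc->lost_left--;
        return false;
    }
    return true;
}

/* Pretend to reset the interface. */
void reset_iface(struct fake_iface *ifc)
{
    ifc->resets++;
}

/* Try a command; after each lost interrupt, reset the interface and retry.
 * Gives up after max_retries resets so the caller gets an error back
 * instead of staying blocked. */
bool run_with_recovery(struct fake_iface *ifc, int max_retries)
{
    for (int attempt = 0; attempt <= max_retries; attempt++) {
        if (issue_and_wait(ifc))
            return true;
        if (attempt < max_retries)
            reset_iface(ifc);
    }
    return false;
}
```

The point of the bounded retry is that a flaky drive ends in an I/O error
for the one program using it, while the interface stays usable.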

Hope this helps,
Romano

PS what does "flu" mean? :)

-- 
Romano Giannetti, Professor  -  Univ. Pontificia Comillas (Madrid, Spain)
Electronic Engineer - phone +34 915 422 800 ext 2410  fax +34 915 596 569
