ksymoops from diskless crash

From: Fred Feirtag (ffeirtag@integrityns.com)
Date: Fri Aug 11 2000 - 16:09:39 EST


Greetings--

Looks like some sort of bug is crashing my etherboot diskless
workstation. I see it on only about 5% of boots. Here's the
ksymoops output:

[root@linux82 /boot]$ ksymoops -m /usr/src/linux-diskless/System.map
ksymoops 2.3.4 on i686 2.4.0-test6. Options used
     -V (default)
     -k /proc/ksyms (default)
     -l /proc/modules (default)
     -o /lib/modules/2.4.0-test6/ (default)
     -m /usr/src/linux-diskless/System.map (specified)

No modules in ksyms, skipping objects
Warning (read_lsmod): no symbols in lsmod, is /proc/modules a valid lsmod file?
Reading Oops report from the terminal
 printing eip:
c010da24
*pde = 00000000
Oops: 0002
CPU: 0
EIP: 0010:[<c010da24>]
Using defaults from ksymoops -t elf32-i386 -a i386
eax: 00000000 ebx: 00000000 ecx: 0000002b edx: 00000202
esi: bfff7020 edi: bfff72f4 ebp: bfff7078 esp: c7a75e80
ds: 0018 es: 0018 ss: 0018
Process linuxrc (pid: 5, stackpage=c7a75000)
Stack: c010858a bfff7078 bfff7018 c7a75fc4 c7a75fb0 c7a74000 c7a74000
00000286
       c010869f bfff7020 bfff7078 c7a75fc4 00010002 c7a73144 00000011
c7a75fb0
       c7a75fb0 c0108a69 00000011 c7a73144 c7a75fb0 c7a75fc4 00000011
c7a74000
Call Trace: [<c010858a>] [<c010869f>] [<c0108a69>] [<c0108ce2>]
[<c0114c0f>] [<c0108074>] [<c0108e2b>]
Code: 00 01 00 d0 59 05 91 d3 c0 a8 0f 71 00 00 00 00 00 00 c0 a8

>>EIP; c010da24 <save_i387+0/6c> <=====
Trace; c010858a <setup_sigcontext+ea/130>
Trace; c010869f <setup_frame+cf/1a4>
Trace; c0108a69 <handle_signal+75/e8>
Trace; c0108ce2 <do_signal+206/26c>
Trace; c0114c0f <schedule+2c3/4b4>
Trace; c0108074 <sys_rt_sigsuspend+e0/f4>
Trace; c0108e2b <system_call+33/38>
Code; c010da24 <save_i387+0/6c>
00000000 <_EIP>:
Code; c010da24 <save_i387+0/6c> <=====
   0: 00 01 add %al,(%ecx) <=====
Code; c010da26 <save_i387+2/6c>
   2: 00 d0 add %dl,%al
Code; c010da28 <save_i387+4/6c>
   4: 59 pop %ecx
Code; c010da29 <save_i387+5/6c>
   5: 05 91 d3 c0 a8 add $0xa8c0d391,%eax
Code; c010da2e <save_i387+a/6c>
   a: 0f 71 (bad)
Code; c010da30 <save_i387+c/6c>
   c: 00 00 add %al,(%eax)
Code; c010da32 <save_i387+e/6c>
   e: 00 00 add %al,(%eax)
Code; c010da34 <save_i387+10/6c>
  10: 00 00 add %al,(%eax)
Code; c010da36 <save_i387+12/6c>
  12: c0 a8 00 00 00 00 00 shrb $0x0,0x0(%eax)

1 warning issued. Results may not be reliable.
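
For anyone reading the report by hand: the `Oops: 0002` value is the i386 page-fault error code. The sketch below is only an illustration and is not part of ksymoops; the helper name `decode_oops_error_code` is made up for this note. It decodes the low three bits, which for 0002 describe a kernel-mode write to a page that is not present. That agrees with the `add %al,(%ecx)` shown at EIP, where ecx is 0000002b.

```python
def decode_oops_error_code(code: int) -> dict:
    """Decode the low three bits of an i386 page-fault error code."""
    return {
        # bit 0: 0 = page not present, 1 = protection violation
        "cause": "protection violation" if code & 0x1 else "page not present",
        # bit 1: 0 = read access, 1 = write access
        "access": "write" if code & 0x2 else "read",
        # bit 2: 0 = fault taken in kernel mode, 1 = in user mode
        "mode": "user" if code & 0x4 else "kernel",
    }


# "0002" is the hexadecimal value printed on the "Oops:" line.
info = decode_oops_error_code(int("0002", 16))
print(info)
```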

Trond Myklebust wrote:

> >>>>> " " == Fred Feirtag <ffeirtag@integrityns.com> writes:
>
> > Trond--
>
> > Helpful? Let me know if more is needed.
>
> Yup: Helpful in that it exonerates NFS. Unfortunately it's in a part
> of the code where I'm not too familiar, namely the i387 (math
> coprocessor) save/restore code.
>
> I'd suggest trying to send the decoded Oops to the l-k list.
>
> Cheers,
> Trond

Linux version 2.4.0-test6 (root@ffeirtag) (gcc version egcs-2.91.66
19990314/Linux (egcs-1.1.2 release)) #5 Fri Aug 11 09:45:47 MDT 2000
BIOS-provided physical RAM map:
 e820: 000000000009fc00 @ 0000000000000000 (usable)
 e820: 0000000000000400 @ 000000000009fc00 (reserved)
 e820: 0000000000010000 @ 00000000000f0000 (reserved)
 e820: 0000000000010000 @ 00000000ffff0000 (reserved)
 e820: 0000000007ef0000 @ 0000000000100000 (usable)
 e820: 000000000000d000 @ 0000000007ff3000 (ACPI data)
 e820: 0000000000003000 @ 0000000007ff0000 (ACPI NVS)
Scan SMP from c0000000 for 1024 bytes.
Scan SMP from c009fc00 for 1024 bytes.
Scan SMP from c00f0000 for 65536 bytes.
Scan SMP from c009fc00 for 4096 bytes.
On node 0 totalpages: 32752
zone(0): 4096 pages.
zone(1): 28656 pages.
zone(2): 0 pages.
mapped APIC to ffffe000 (01202000)
Kernel command line: auto rw root=/dev/nfs nfsroot=/diskless/%s single
console=ttyS0,9600 ramdisk_size=6144
ip=192.168.15.82:192.168.15.33:192.168.15.1:255.255.255.0:
Initializing CPU#0
Detected 601073024 Hz processor.
Console: colour VGA+ 80x25
Calibrating delay loop... 1199.31 BogoMIPS
Memory: 124248k/131008k available (1606k kernel code, 6372k reserved,
116k data, 216k init, 0k highmem)
Dentry-cache hash table entries: 16384 (order: 5, 131072 bytes)
Buffer-cache hash table entries: 4096 (order: 2, 16384 bytes)
Page-cache hash table entries: 32768 (order: 5, 131072 bytes)
Inode-cache hash table entries: 8192 (order: 4, 65536 bytes)
CPU: L1 I Cache: 64K L1 D Cache: 64K (64 bytes/line)
CPU: L2 Cache: 1K
CPU: AMD Duron(tm) Processor stepping 00
Checking 386/387 coupling... OK, FPU using exception 16 error reporting.
Checking 'hlt' instruction... OK.
POSIX conformance testing by UNIFIX
PCI: PCI BIOS revision 2.10 entry at 0xfb260, last bus=1
PCI: Using configuration type 1
PCI: Probing PCI hardware
PCI: Using IRQ router VIA [1106/0596] at 00:07.0
Linux NET4.0 for Linux 2.4
Based upon Swansea University Computer Society NET3.039
NET4: Unix domain sockets 1.0/SMP for Linux NET4.0.
NET4: Linux TCP/IP 1.0 for NET4.0
IP Protocols: ICMP, UDP, TCP
IP: routing cache hash table of 512 buckets, 4Kbytes
TCP: Hash tables configured (established 8192 bind 8192)
ACPI: "VIA694" found at 0x000f7040
Starting kswapd v1.7
pty: 256 Unix98 ptys configured
RAMDISK driver initialized: 16 RAM disks of 6144K size 1024 blocksize
Uniform Multi-Platform E-IDE driver Revision: 6.31
ide: Assuming 33MHz system bus speed for PIO modes; override with
idebus=xx
VP_IDE: IDE controller on PCI bus 00 dev 39
VP_IDE: chipset revision 16
VP_IDE: not 100% native mode: will probe irqs later
hda: Conner Technology CT204, ATA DISK drive
hdc: TOSHIBA CD-ROM XM-6702B, ATAPI CDROM drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
ide1 at 0x170-0x177,0x376 on irq 15
hdc: ATAPI 48X CD-ROM drive, 128kB Cache
Uniform CD-ROM driver Revision: 3.11
Floppy drive(s): fd0 is 1.44M
FDC 0 is a post-1991 82077
scsi : 0 hosts.
scsi : detected total.
RAMDISK: Compressed image found at block 0
Freeing initrd memory: 2156k freed
Serial driver version 5.01 (2000-05-29) with MANY_PORTS SHARE_IRQ
SERIAL_PCI enabled
ttyS00 at 0x03f8 (irq = 4) is a 16550A
ttyS01 at 0x02f8 (irq = 3) is a 16550A
eepro100.c:v1.09j-t 9/29/99 Donald Becker
http://cesdis.gsfc.nasa.gov/linux/drivers/eepro100.html
eepro100.c: $Revision: 1.33 $ 2000/05/24 Modified by Andrey V. Savochkin
<saw@saw.sw.com.sg> and others
eth0: Intel Corporation 82557 [Ethernet Pro 100], 00:D0:B7:27:93:82, IRQ 10.
  Receiver lock-up bug exists -- enabling work-around.
  Board assembly 721383-008, Physical connectors present: RJ45
  Primary interface chip i82555 PHY #1.
  General self-test: passed.
  Serial sub-system self-test: passed.
  Internal registers self-test: passed.
  ROM checksum self-test: passed (0x04f4518b).
[drm] Initialized tdfx 1.0.0 20000719 on minor 63
Via 686a audio driver 1.1.8
ac97_codec: AC97 audio codec, id: 0x4943:0x4511 (Unknown)
via82cxxx: board #1 at 0xDC00, IRQ 11
Creative EMU10K1 PCI Audio Driver, version 0.5, 17:05:16 Aug 10 2000
kmem_create: Forcing size word alignment - nfs_fh
VFS: Mounted root (ext2 filesystem).
cannot stat /var/run/utmp. Please "unset watch".
Unable to handle kernel NULL pointer dereference at virtual address 0000002b
 printing eip:
c010da24
*pde = 00000000
Oops: 0002
CPU: 0
EIP: 0010:[<c010da24>]
eax: 00000000 ebx: 00000000 ecx: 0000002b edx: 00000202
esi: bfff7020 edi: bfff72f4 ebp: bfff7078 esp: c7a75e80
ds: 0018 es: 0018 ss: 0018
Process linuxrc (pid: 5, stackpage=c7a75000)
Stack: c010858a bfff7078 bfff7018 c7a75fc4 c7a75fb0 c7a74000 c7a74000
00000286
       c010869f bfff7020 bfff7078 c7a75fc4 00010002 c7a73144 00000011
c7a75fb0
       c7a75fb0 c0108a69 00000011 c7a73144 c7a75fb0 c7a75fc4 00000011
c7a74000
Call Trace: [<c010858a>] [<c010869f>] [<c0108a69>] [<c0108ce2>]
[<c0114c0f>] [<c0108074>] [<c0108e2b>]
Code: 00 01 00 d0 59 05 91 d3 c0 a8 0f 71 00 00 00 00 00 00 c0 a8
Looking up port of RPC 100003/2 on 192.168.15.33
Looking up port of RPC 100005/2 on 192.168.15.33
VFS: Mounted root (nfs filesystem).
change_root: old root has d_count=4
Trying to unmount old root ... <3>error -16
Change root to /initrd: error -2
Freeing unused kernel memory: 216k freed
INIT: version 2.78 booting
Unable to handle kernel NULL pointer dereference at virtual address
0000002b
...
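
One possible reading of the `Code:` bytes, offered as a guess and not a diagnosis: they disassemble to nonsense at `save_i387+0` (including a `(bad)` opcode), so they may be data rather than instructions. The sketch below takes the `ip=` value from the kernel command line, assumes the usual field order of client, server, gateway, netmask, and scans the `Code:` bytes for any 4-byte window that would be an address on the client's subnet. The helper names `parse_ip_param` and `subnet_hits` are hypothetical.

```python
import ipaddress

# Values copied from the boot log and the Oops report above.
IP_PARAM = "192.168.15.82:192.168.15.33:192.168.15.1:255.255.255.0:"
CODE_BYTES = "00 01 00 d0 59 05 91 d3 c0 a8 0f 71 00 00 00 00 00 00 c0 a8"


def parse_ip_param(value):
    """Name the first four colon-separated fields of ip= (assumed order)."""
    names = ("client", "server", "gateway", "netmask")
    return dict(zip(names, value.split(":")))


def subnet_hits(code_hex, network):
    """Return (offset, address) for each 4-byte window that lies in network."""
    data = bytes.fromhex(code_hex)  # fromhex ignores the spaces
    hits = []
    for off in range(len(data) - 3):
        addr = ipaddress.IPv4Address(data[off:off + 4])
        if addr in network:
            hits.append((off, str(addr)))
    return hits


fields = parse_ip_param(IP_PARAM)
# strict=False lets us build the network from a host address plus netmask.
network = ipaddress.IPv4Network(
    f"{fields['client']}/{fields['netmask']}", strict=False)
hits = subnet_hits(CODE_BYTES, network)
print(network, hits)
```

The only hit is at offset 8, where `c0 a8 0f 71` reads as 192.168.15.113, an address on the same /24 as the client. That is suggestive of network data landing on top of kernel text, but the log alone does not prove it.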

Let me know if I can provide more useful info.

Fred


