Hello,
>From PROBLEM: report form
[1.] Oops 2 error condition occurs in clock thread of a multi-threaded
emulator application when running on new HP hardware. The problem
is not seen with older hardware.
[2.] The clock thread is used to generate a periodic interrupt to the
emulated hardware. The same application, E11, (see www.dbit.com)
runs on many other machines, but only the new test machines see this
failure. The hardware in question is the HP Netserver LPr 700. It
looks the same as the Netserver 550 (we have many of these which
run the same distribution with no errors) but the LPr 700 has
Coppermine and the mainboard is a "new etch", according to HP. We
speculate that this problem is timing-related (no pun intended) or
hardware-related. This machine is probably not out in the general
market yet. If needs be, we can supply one.
The most annoying thing is that these machines will run under hard
load for days or weeks before this problem occurs, but once it
happens, E11 is basically fried.
[3.] Keywords: kernel; memory management.
[4.] Version: 2.2.14/smp originally from Red Hat 6.1 distribution
[5.] Oops:
-----------------------------------------------
ksym-oops output
-----------------------------------------------
bash# ./ksymoops -v ../linux/vmlinux messages
ksymoops 2.3.3 on i686 2.2.14. Options used
-v ../linux/vmlinux (specified)
-k /proc/ksyms (default)
-l /proc/modules (default)
-o /lib/modules/2.2.14/ (default)
-m /usr/src/linux/System.map (default)
No modules in ksyms, skipping objects
Warning (read_lsmod): no symbols in lsmod, is /proc/modules a valid lsmod file?
Unable to handle kernel paging request at virtual address 0400e3d0
current->tss.cr3 = 0f2da000, %cr3 = 0f2da000
*pde = 00000000
Oops: 0002
CPU: 1
EIP: 0010:[<c0118d43>]
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010246
eax: 00000000 ebx: 0400e3d8 ecx: 00000002 edx: 00000000
esi: cf03dfb8 edi: 0400e3d0 ebp: 00000000 esp: cf03dfac
ds: 0018 es: 0018 ss: 0018
Process mar15d (pid: 524, process nr: 25, stackpage=cf03d000)
Stack: 0809e270 0805f660 c025ac20 38db2f0a 000b2908 c0107a50 0400e3d0 00000000
00000000 000091b4 0809e270 0805f660 0000004e 00000057 0000002b 0000004e
0808b23b 00000023 00000202 080a13f0 0000002b
Call Trace: [<c0107a50>]
Code: f3 a5 8d 76 00 85 c9 75 34 85 ed 74 38 b9 08 00 00 00 b8 00
>>EIP; c0118d43 <sys_gettimeofday+43/94> <=====
Trace; c0107a50 <system_call+34/38>
Code; c0118d43 <sys_gettimeofday+43/94>
00000000 <_EIP>:
Code; c0118d43 <sys_gettimeofday+43/94> <=====
0: f3 a5 repz movsl %ds:(%esi),%es:(%edi) <=====
Code; c0118d45 <sys_gettimeofday+45/94>
2: 8d 76 00 leal 0x0(%esi),%esi
Code; c0118d48 <sys_gettimeofday+48/94>
5: 85 c9 testl %ecx,%ecx
Code; c0118d4a <sys_gettimeofday+4a/94>
7: 75 34 jne 3d <_EIP+0x3d> c0118d80 <sys_gettimeofday+80/94>
Code; c0118d4c <sys_gettimeofday+4c/94>
9: 85 ed testl %ebp,%ebp
Code; c0118d4e <sys_gettimeofday+4e/94>
b: 74 38 je 45 <_EIP+0x45> c0118d88 <sys_gettimeofday+88/94>
Code; c0118d50 <sys_gettimeofday+50/94>
d: b9 08 00 00 00 movl $0x8,%ecx
Code; c0118d55 <sys_gettimeofday+55/94>
12: b8 00 00 00 00 movl $0x0,%eax
1 warning issued. Results may not be reliable.
bash# ./ksymoops -v ../linux/vmlinux messages > report
[6.] The system runs an emulator. The emulated code dictates what the
emulator does... not possible to provide a sample because the
affected machine will run for a long time before this happens. The
clock processing thread itself is quite simple.
[7.1] See also: www.dbit.com. The application running on the emulated
hardware is network-intensive, but it runs DECnet, not TCP/IP. The
emulator uses Linux's drivers to get/put Ethernet frames to/from
drivers running under the emulated machine's O/S, which is RSX-11M/
PLUS, late of DEC, now owned by Mentec (see: www.mentec.com). The
emulator effectively uses SMP by placing disk, network, and PDP-11
CPU in different threads. We're looking forward to the next stable
SMP release :-)
[7.2] Processor Info
--------------
bash# cat cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 8
model name : Pentium III (Coppermine)
stepping : 1
cpu MHz : 698.820797
cache size : 256 KB
fdiv_bug : no
hlt_bug : no
sep_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 3
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 pn mmx fxsr xmm
bogomips : 697.96
processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 8
model name : Pentium III (Coppermine)
stepping : 1
cpu MHz : 698.820797
cache size : 256 KB
fdiv_bug : no
hlt_bug : no
sep_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 3
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 pn mmx fxsr xmm
bogomips : 697.96
---------
[7.3] Modules: Supported, none loaded
[7.4] SCSI information: The LPr has a Symbios controller. Our kernel has
support for Adaptec 78xx and Symbios 53c8xx. Emulated disks are on
raw ("unformatted") partitions.
[7.5] Other information
Details about the emulator would come from John Wilson:
The machines discussed are not running any graphic software. The
emulated machine's (PDP-11) console is running off serial port A
assigned to /dev/ttyS0. The Linux console is idle, with no telnet
access (network running under RSX-11M/Plus's DECnet implementation).
After failure there are two remaining E11 threads in this list,
the mar15d ones; the main emulator is QUIT out:
--------------
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.0 1104 72 ? S Mar17 0:11 init [3]
root 2 0.0 0.0 0 0 ? SW Mar17 0:05 [kflushd]
root 3 0.0 0.0 0 0 ? SW Mar17 0:49 [kupdate]
root 4 0.0 0.0 0 0 ? SW Mar17 0:00 [kpiod]
root 5 0.0 0.0 0 0 ? SW Mar17 0:15 [kswapd]
bin 274 0.0 0.0 1196 0 ? SW Mar17 0:00 [portmap]
root 329 0.0 0.0 1152 56 ? S Mar17 0:01 syslogd -m 0
root 340 0.0 0.0 1396 228 ? S Mar17 0:00 klogd
daemon 356 0.0 0.0 1128 104 ? S Mar17 0:00 /usr/sbin/atd
root 372 0.0 0.0 1304 108 ? S Mar17 0:01 crond
root 392 0.0 0.0 1124 0 ? SW Mar17 0:00 [inetd]
root 408 0.0 0.0 1176 0 ? SW Mar17 0:00 [lpd]
root 446 0.0 0.1 2104 284 ? S Mar17 0:03 sendmail: accepti
root 463 0.0 0.0 1132 136 ? S Mar17 0:00 gpm -t ps/2
xfs 480 0.0 0.0 1880 60 ? S Mar17 0:00 xfs -droppriv -da
root 514 0.0 0.2 1704 648 tty1 S Mar17 0:00 /bin/bash --
root 515 0.0 0.0 1696 0 tty2 SW Mar17 0:00 [bash]
root 516 0.0 0.0 1076 0 tty3 SW Mar17 0:00 [mingetty]
root 517 0.0 0.0 1076 0 tty4 SW Mar17 0:00 [mingetty]
root 518 0.0 0.0 1076 0 tty5 SW Mar17 0:00 [mingetty]
root 519 0.0 0.0 1076 0 tty6 SW Mar17 0:00 [mingetty]
root 520 0.0 0.1 1672 268 ttyS0 S Mar17 0:00 bash /etc/starte1
root 526 0.1 1.7 4720 4488 ttyS0 S< Mar17 18:35 [mar15d]
root 527 0.0 1.7 4720 4488 ttyS0 S< Mar17 1:06 [mar15d]
root 3917 0.0 0.3 1696 896 ttyS0 S 07:24 0:00 bash
root 3921 0.0 0.3 2672 960 ttyS0 R 07:25 0:00 ps -aux
bash#
[9] Other
Please inform if HP Engineering should be involved.
Thank you,
John Shilling
System Consultant
5404 Knoxville Drive
College Park, MD 20740
(301) 441-2357
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/
This archive was generated by hypermail 2b29 : Fri Mar 31 2000 - 21:00:17 EST