2.0.33 oops, non-fatal, comments anyone?

Chris Evans (chris@ferret.lmh.ox.ac.uk)
Wed, 25 Mar 1998 03:22:43 +0000 (GMT)


Hi,

Just had an oops on our server. The machine is still alive and processing
away merrily. This is the second problem we've had on 2.0.33; 2.0.32 ran for
more than 100 days without trouble, as did 2.0.32pre2 before it.

Here's the oops:

general protection: 0000
CPU: 0
EIP: 0010:[<001115c4>]
EFLAGS: 00010212
eax: 6362674a ebx: 01ce8018 ecx: 6362674a edx: 64636364
esi: 00000717 edi: 096c0026 ebp: 018afe80 esp: 018afe58
ds: 0018 es: 0018 fs: 002b gs: 002b ss: 0018
Process unixded (pid: 22126, process nr: 45, stackpage=018af000)
Stack: 00000000 00000000 00000000 096c0026 02179cb8 001b8d04 00002712 00000293
0012cf29 00000003 00000000 0012d031 00000001 00000000 bffffcac 00000000
02712000 00000000 02712000 0012d29b 00000000 018aff54 018aff14 018afed4
Call Trace: [<0012cf29>] [<0012d031>] [<0012d29b>] [<0014d9ac>] [<0014130b>] [<0011095f>] [<0010a625>]
Code: 83 ba 2c 01 00 00 00 74 0f 8b 82 30 01 00 00 05 e8 03 00 00
task not on run-queue

And let's have a look at what those EIP and call trace thingies are, shall
we?

Mar 25 02:56:55 ferret kernel: EIP: 0010:[schedule+384/652]
Mar 25 02:56:56 ferret kernel: Call Trace: [do_select+133/484]
[do_select+397/484] [sys_select+387/596] [udp_rcv+956/976]
[ip_rcv+1091/1396] [old_select+63/80] [system_call+85/124]
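
For anyone wondering how the raw addresses turn into symbol+offset form: it is
a nearest-lower-symbol lookup. A minimal sketch follows, assuming a sorted
symbol table. The three start addresses are not copied from a real System.map;
they are derived from the oops itself (raw address minus the reported offset),
and the names SYMBOLS and resolve are made up for illustration.

```python
import bisect

# Start addresses derived from this oops (address minus reported offset).
# A real lookup would load every symbol from the running kernel's System.map.
SYMBOLS = [
    (0x00111444, "schedule"),
    (0x0012CEA4, "do_select"),
    (0x0012D118, "sys_select"),
]
STARTS = [start for start, _ in SYMBOLS]


def resolve(addr):
    """Map an address to 'symbol+offset' using the nearest lower symbol."""
    i = bisect.bisect_right(STARTS, addr) - 1
    if i < 0:
        return None  # address lies below the first known symbol
    start, name = SYMBOLS[i]
    return f"{name}+{addr - start}"


for addr in (0x001115C4, 0x0012CF29, 0x0012D031, 0x0012D29B):
    print(f"{addr:08x} -> {resolve(addr)}")
```

With a table this sparse, any address above the last entry resolves to
sys_select, so the other call trace entries need the full symbol list.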

And some code...

Code: 00000000 <_EIP> cmpl $0x0,0x12c(%edx)
Code: 00000007 <_EIP+7> je 00000018 <_EIP+18>
Code: 00000009 <_EIP+9> movl 0x130(%edx),%eax
Code: 0000000f <_EIP+f> addl $0x3e8,%eax
Code: 00000014 <_EIP+14>
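
The first instruction in that listing can be checked by hand against the Code:
bytes in the oops. The sketch below splits the ModRM byte of
"83 ba 2c 01 00 00 00" into its fields; split_modrm and REG32 are names chosen
here for illustration, and only the simple no-SIB case seen in this oops is
handled.

```python
# 32-bit register numbering as used in the ModRM r/m field.
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def split_modrm(byte):
    """Return the (mod, reg, rm) fields of an x86 ModRM byte."""
    return byte >> 6, (byte >> 3) & 7, byte & 7


# First seven bytes from the "Code:" line of the oops.
code = bytes.fromhex("83ba2c01000000")
mod, reg, rm = split_modrm(code[1])
disp = int.from_bytes(code[2:6], "little")  # 32-bit displacement (mod == 2)
imm = code[6]                               # 8-bit immediate

# Opcode 0x83 with reg field 7 is CMP r/m32, imm8.
print(f"mod={mod} reg={reg} base=%{REG32[rm]} disp={disp:#x} imm={imm:#x}")
```

That gives a compare of immediate 0 against the dword at 0x12c(%edx), matching
the disassembly, so the faulting access goes through %edx.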

%edx doesn't look like an amazingly dereferenceable pointer to me.....
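
One thing that may be worth checking: read as little-endian bytes, the values
in %edx and %eax are entirely printable ASCII, which would be consistent with
(though it certainly doesn't prove) string data having been written over a
kernel structure. A small sketch of the conversion, with as_ascii being a
helper name invented for this example:

```python
import struct


def as_ascii(value):
    """Render a 32-bit register value as its four little-endian bytes,
    showing printable ASCII as-is and everything else as '.'."""
    raw = struct.pack("<I", value)
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


# Register values taken from the oops above.
for name, value in (("edx", 0x64636364), ("eax", 0x6362674A),
                    ("ebx", 0x01CE8018)):
    print(f"{name}: {value:08x} -> {as_ascii(value)!r}")
```

By contrast, %ebx, which looks like a plausible kernel pointer, comes out as
four non-printable bytes.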

This oops has me rather nervous, hot on the heels of a wait queue
corruption that brought this machine down a couple of weeks ago. Looking at
patch-2.0.33.gz, I see there are some wait-queue related changes. Hmmmm.

One final data point: I'm using 5.0.7 of Doug's aic7xxx patches. Doug,
have you changed the wait queue handling in your patches, and if so are
they likely to accidentally stomp on wait queues?

Cheers
Chris
