debug output after crash with 1.99.14

Michael Stiller (michael@toyland.ping.de)
Tue, 11 Jun 1996 09:55:51 +0200


Hello Alan, Linus and all others.

I already reported problems with our ISP server machine running 1.99.14 to
vger. After examining the oops messages I decided to put the following
in /usr/src/linux/include/linux/skbuff.h:
-
#define CONFIG_SKB_CHECK 1
#define PARANOID_BUGHUNT_MODE 1
-
Some hours after rebooting into this kernel, the machine crashed again with
the following messages. Please have a look at the skbuff.c messages:

--
Jun 11 03:40:53 lilly kernel: Unable to handle kernel paging request at 
virtual address c90000ec
Jun 11 03:40:53 lilly kernel: current->tss.cr3 = 03129000, Pr3 = 03129000
Jun 11 03:40:53 lilly kernel: *pde = 00000000
Jun 11 03:40:53 lilly kernel: Oops: 0000
Jun 11 03:40:53 lilly kernel: CPU:    0
Jun 11 03:40:53 lilly kernel: EIP:    0010:[<00136e10>]
Jun 11 03:40:53 lilly kernel: EFLAGS: 00010206
Jun 11 03:40:53 lilly kernel: eax: 00ff5ce4   ebx: 00ff5ce4   ecx: 00ff5cdc   
edx: 00000001
Jun 11 03:40:53 lilly kernel: esi: 090000e0   edi: 090000e0   ebp: 0000005c   
esp: 01f30e80
Jun 11 03:40:53 lilly kernel: ds: 0018   es: 0018   fs: 002b   gs: 002b   ss: 
0018
Jun 11 03:40:53 lilly kernel: Process gated (pid: 116, process nr: 9, 
stackpage=01f30000)
Jun 11 03:40:53 lilly kernel: Stack: 03119a18 090000e0 090000e0 01f30eac 
001b5890 0014c44e 000000f4 00000001 
Jun 11 03:40:53 lilly kernel:        03119a18 090000e0 001c04a0 090000e0 
0014c892 001c04a0 090000e0 00000016 
Jun 11 03:40:53 lilly kernel:        00000013 00ff5de0 001c04a0 0014ca98 
001c04a0 090000e0 033e0018 00000000 
Jun 11 03:40:53 lilly kernel: Call Trace: [<0014c44e>] [<0014c892>] [<0014ca98>
] [<00140f04>] [<0014add4>] [<001353ff>] [<00135a88>] 
Jun 11 03:40:53 lilly kernel:        [<0010a3b2>] 
Jun 11 03:40:53 lilly kernel: Code: 81 7e 0c d1 de c0 de 75 0e 56 68 f2 34 1a 
00 e8 94 b2 fd ff 
Jun 11 03:40:55 lilly kernel: fcntl_setlk() called by process 140 (sendmail) 
with broken flock() emulation
Jun 11 03:40:55 lilly last message repeated 2 times
Jun 11 03:40:59 lilly kernel: File: skbuff.c Line 398, passed a non skb!
Jun 11 03:40:59 lilly kernel: skb=0358ecb4, real size=0, free=0
Jun 11 03:40:59 lilly kernel: File: skbuff.c Line 400, passed a non skb!
Jun 11 03:40:59 lilly kernel: skb=0358ecb4, real size=0, free=0
Jun 11 03:41:02 lilly kernel: Adding Swap: 51196k swap-space
Jun 11 03:41:02 lilly kernel: File: skbuff.c Line 398, passed a non skb!
Jun 11 03:41:02 lilly kernel: skb=0358ecb4, real size=0, free=0
Jun 11 03:41:02 lilly kernel: File: skbuff.c Line 400, passed a non skb!
Jun 11 03:41:02 lilly kernel: skb=0358ecb4, real size=0, free=0
Jun 11 03:41:11 lilly kernel: File: skbuff.c Line 585, control overrun
Jun 11 03:41:11 lilly kernel: skb=00ff5be8, end=03f4f637
Jun 11 03:41:30 lilly kernel: File: skbuff.c Line 585, control overrun
Jun 11 03:41:30 lilly kernel: skb=00ff5be8, end=03e2ce2f
Jun 11 03:41:50 lilly kernel: File: skbuff.c Line 479, bad next skb member
Jun 11 03:41:50 lilly kernel: skb_unlink: not a linked element
Jun 11 03:41:50 lilly kernel: double lock on device queue, lock=97 
caller=00137d83
Jun 11 03:41:50 lilly kernel: File: dev.c Line 346, bad next skb member
Jun 11 03:41:50 lilly kernel: general protection: 0000
Jun 11 03:41:50 lilly kernel: CPU:    0
Jun 11 03:41:50 lilly kernel: EIP:    0010:[<00000000>]
Jun 11 03:41:50 lilly kernel: EFLAGS: 00010202
Jun 11 03:41:50 lilly kernel: eax: ffffffff   ebx: 00000001   ecx: 0004000d   
edx: 00000000
Jun 11 03:41:50 lilly kernel: esi: 00603b00   edi: 00603a68   ebp: 00603b04   
esp: 033fcf30
Jun 11 03:41:50 lilly kernel: ds: 0018   es: 0018   fs: 002b   gs: 002b   ss: 
0018
Jun 11 03:41:50 lilly kernel: Process sendmail (pid: 140, process nr: 26, 
stackpage=033fc000)
Jun 11 03:41:50 lilly kernel: Stack: 00137b41 00603b00 00603a68 00603b00 
00603b04 00000001 fffffffe 00000001 
Jun 11 03:41:50 lilly kernel:        00000212 00137d94 00603b00 00603a68 
fffffffe 00603b00 00603b00 00603b04 
Jun 11 03:41:50 lilly kernel:        00603a68 ffffff80 00000080 001d9480 
00000212 00137c47 00603a68 00000080 
Jun 11 03:41:50 lilly kernel: Call Trace: [<00137b41>] [<00137d94>] [<00137c47>
] [<00137c5d>] [<0011618b>] [<0010a33b>] 
Jun 11 03:41:50 lilly kernel: Code: 01 00 00 00 6f ef 00 f0 c3 e2 00 f0 6f ef 
00 f0 6f ef 00 f0 
Jun 11 03:41:50 lilly kernel: Aiee, killing interrupt handler
Jun 11 03:41:51 lilly kernel: File: skbuff.c Line 479, bad next skb member
Jun 11 03:41:51 lilly kernel: skb_unlink: not a linked element
Jun 11 03:41:51 lilly kernel: double lock on device queue, lock=98 
caller=00137d83
Jun 11 03:41:51 lilly kernel: File: dev.c Line 346, bad next skb member
Jun 11 03:41:51 lilly kernel: general protection: 0000
Jun 11 03:41:51 lilly kernel: CPU:    0
Jun 11 03:41:51 lilly kernel: EIP:    0010:[<00000000>]
Jun 11 03:41:51 lilly kernel: EFLAGS: 00010202
Jun 11 03:41:51 lilly kernel: eax: ffffffff   ebx: 00000001   ecx: 0004000d   
edx: 00000000
Jun 11 03:41:51 lilly kernel: esi: 00603b00   edi: 00603a68   ebp: 00603b04   
esp: 03f9df30
Jun 11 03:41:51 lilly kernel: ds: 0018   es: 0018   fs: 002b   gs: 002b   ss: 
0018
Jun 11 03:41:51 lilly kernel: Process uucico (pid: 198, process nr: 51, 
stackpage=03f9d000)
Jun 11 03:41:51 lilly kernel: Stack: 00137b41 00603b00 00603a68 00603b00 
00603b04 00000001 fffffffe 00000001 
Jun 11 03:41:51 lilly kernel:        00000212 00137d94 00603b00 00603a68 
fffffffe 00603b00 00603b00 00603b04 
Jun 11 03:41:51 lilly kernel:        00603a68 ffffff80 00000080 001d9480 
00000212 00137c47 00603a68 00000080 
Jun 11 03:41:51 lilly kernel: Call Trace: [<00137b41>] [<00137d94>] [<00137c47>
] [<00137c5d>] [<0011618b>] [<0010a33b>] 
Jun 11 03:41:51 lilly kernel: Code: 01 00 00 00 6f ef 00 f0 c3 e2 00 f0 6f ef 
00 f0 6f ef 00 f0 
Jun 11 03:41:51 lilly kernel: Aiee, killing interrupt handler
Jun 11 03:41:51 lilly kernel: File: skbuff.c Line 479, bad next skb member
Jun 11 03:41:51 lilly kernel: skb_unlink: not a linked element
Jun 11 03:41:51 lilly kernel: double lock on device queue, lock=99 
caller=00137d83
Jun 11 03:41:51 lilly kernel: File: dev.c Line 346, bad next skb member
Jun 11 03:52:46 lilly kernel: klogd 1.3-0, log source = /proc/kmsg started.
--
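One thing worth checking in the first oops, as a rough sketch rather than a
diagnosis: the faulting address looks like esi plus the 0xc offset used by
the cmpl at the EIP, seen through the kernel segment. The segment base of
0xc0000000 is an assumption about how these i386 kernels set up the kernel
segment, and the helper name below is made up for illustration.

```c
#include <stdint.h>

/* Assumed base of the kernel segment on i386 for this kernel series. */
#define KERNEL_SEGMENT_BASE 0xc0000000u

/*
 * Hypothetical helper: linear address touched by an access at
 * reg + offset made through the kernel data segment.
 * The arithmetic wraps modulo 2^32, as it does on the CPU.
 */
uint32_t kernel_linear_address(uint32_t reg, uint32_t offset)
{
    return KERNEL_SEGMENT_BASE + reg + offset;
}
```

With esi = 0x090000e0 and the offset 0xc from the oops,
kernel_linear_address(0x090000e0, 0xc) gives 0xc90000ec, which is the
address in the "Unable to handle kernel paging request" line. If the
assumption holds, the value in esi is itself the bad pointer.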
Here is the ksymoops output for the oopses, in order of appearance.
I guess the first one is the relevant one; the others probably result
from the first problem.
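For anyone who wants to check the symbol+offset notation by hand, here is a
minimal sketch of the lookup, not the way ksymoops itself is implemented.
The table is not a real System.map: the start addresses are derived from the
decoded output below by subtracting the printed offset from each address,
and the names ksym, sample_map and ksym_lookup are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* One System.map-style entry: start address and symbol name. */
struct ksym {
    uint32_t addr;
    const char *name;
};

/*
 * Start addresses derived from the decoded traces, e.g.
 * 0x136e10 - 0x80 = 0x136d90 for alloc_skb. Sorted by ascending
 * address. A real System.map lists every symbol, so addresses that
 * fall between these entries would resolve differently there.
 */
const struct ksym sample_map[] = {
    { 0x136d90, "alloc_skb" },
    { 0x1379d8, "do_dev_queue_xmit" },
    { 0x137c28, "dev_transmit" },
    { 0x137c54, "net_bh" },
    { 0x137d50, "dev_tint" },
    { 0x14c43c, "igmp_send_report" },
};

/*
 * Return the entry with the greatest start address <= addr, or NULL
 * if addr lies below the first symbol. Binary search over a table
 * sorted by address.
 */
const struct ksym *ksym_lookup(const struct ksym *map, size_t n, uint32_t addr)
{
    size_t lo = 0, hi = n;  /* search window is [lo, hi) */

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (map[mid].addr <= addr)
            lo = mid + 1;   /* entry starts at or before addr */
        else
            hi = mid;       /* entry starts after addr */
    }
    /* lo is now the number of entries starting at or before addr. */
    return lo ? &map[lo - 1] : NULL;
}
```

Looking up 0x136e10 in this table returns the alloc_skb entry, and
0x136e10 minus its start address is 0x80, matching the >>EIP line. The
number after the slash in the ksymoops output is presumably the function
length, which would need the next symbol's address as well.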
-
Using `./System.map' to map addresses to symbols.

>>EIP: 136e10 <alloc_skb+80/16c>
Trace: 14c44e <igmp_send_report+12/b0>
Trace: 14c892 <ip_mc_inc_group+86/90>
Trace: 14ca98 <ip_mc_join_group+f8/128>
Trace: 140f04 <ip_setsockopt+494/5b0>
Trace: 14add4 <inet_setsockopt+48/5c>
Trace: 1353ff <sys_setsockopt+63/78>
Trace: 135a88 <sys_socketcall+270/2dc>
Trace: 10a3b2 <system_call+52/80>

Code: 136e10 <alloc_skb+80/16c> cmpl $0xdec0ded1,0xc(%esi)
Code: 136e17 <alloc_skb+87/16c> jne 136e27 <alloc_skb+97/16c>
Code: 136e19 <alloc_skb+89/16c> pushl %esi
Code: 136e1a <alloc_skb+8a/16c> pushl $0x1a34f2
Code: 136e1f <alloc_skb+8f/16c> call fffdb2a8 <_EIP+fffdb2a8>
-
Using `./System.map' to map addresses to symbols.

Trace: 137b41 <do_dev_queue_xmit+169/190>
Trace: 137d94 <dev_tint+44/6c>
Trace: 137c47 <dev_transmit+1f/2c>
Trace: 137c5d <net_bh+9/fc>
Trace: 11618b <do_bottom_half+3b/60>
Trace: 10a33b <handle_bottom_half+b/20>

Code: addl %eax,(%eax)
Code: addb %al,(%eax)
Code: outsl %ds:(%esi),(%dx)
Code: outl %eax,(%dx)
Code: addb %dh,%al
Code: ret
Code: loop 0000000b <_EIP+b>
Code: lock outsl %ds:(%esi),(%dx)
Code: outl %eax,(%dx)
Code: addb %dh,%al
Code: outsl %ds:(%esi),(%dx)
Code: outl %eax,(%dx)
Code: addb %dh,%al
-
Using `./System.map' to map addresses to symbols.

Trace: 137b41 <do_dev_queue_xmit+169/190>
Trace: 137d94 <dev_tint+44/6c>
Trace: 137c47 <dev_transmit+1f/2c>
Trace: 137c5d <net_bh+9/fc>
Trace: 11618b <do_bottom_half+3b/60>
Trace: 10a33b <handle_bottom_half+b/20>

Code: addl %eax,(%eax)
Code: addb %al,(%eax)
Code: outsl %ds:(%esi),(%dx)
Code: outl %eax,(%dx)
Code: addb %dh,%al
Code: ret
Code: loop 0000000b <_EIP+b>
Code: lock outsl %ds:(%esi),(%dx)
Code: outl %eax,(%dx)
Code: addb %dh,%al
Code: outsl %ds:(%esi),(%dx)
Code: outl %eax,(%dx)
Code: addb %dh,%al
-
I hope this information is useful to you to catch the bug.

Regards,

-Michael

-- 
x(f,s,c)char *s;{return f&1 ? *s ? *s-c ? x(f,++s,c) :7[s]:0:f&2 
? x(--f,"!/*,xq-ih9]c$=le&M t)r\nm@p31n%ag.8}Sdoy",c):f&4 ? *s ? 
x(f,s+1,putchar(x(f-2,"^&%!*)",*s))) : 0 : 0;}main(){return x(4,
"]!x/mhicn$!iihle&!x/mhiM$agimr%p !r@p%he&!x/mhiM !r@p%he",65);}