RE: Oops in 2.2.14pre12 (and 2.3.33, network related)

Leeuw van der, Tim (tim.leeuwvander@nl.unisys.com)
Fri, 17 Dec 1999 03:56:08 -0600


> -----Original Message-----
> From: Manfred Spraul [mailto:manfreds@colorfullife.com]
> Sent: 16 December 1999 22:05
> To: Alan Cox
> Cc: Leeuw van der, Tim; 'linux-kernel@vger.rutgers.edu'
> Subject: Re: Oops in 2.2.14pre12
>
>
> I traced the bug back to tcp_transmit_skb(): the function pointer
> tp->af_specific->queue_xmit got corrupted, and thus the CPU jumped to a
> bogus address. This caused an oops. The oops code itself triggered
> another oops when it tried to dump the code at that address.
>
> 1) What about adding safety checks before dereferencing the EIP pointer?
> Everything outside the range from 0xC0000000 to the end of normal memory
> is obviously wrong. [Add the appropriate macros]
>
> 2) Do you have any idea what mangled the function pointer? Any critical
> changes?
>
> Tim, did you load/unload any modules immediately prior to the oops? How
> much memory do you have?
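
The range check suggested in point 1 could look roughly like the sketch
below. It is only an illustration, not the kernel's actual code: the helper
name addr_range_plausible and the high_mem parameter are hypothetical, and it
assumes an i386 kernel mapped at 0xC0000000 (3 GB) with physical memory mapped
linearly from there. Code in loaded modules typically lives outside this
linear range, so a real check would have to account for that as well.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: i386, kernel direct mapping starts at 3 GB. */
#define KERNEL_BASE 0xC0000000u

/*
 * Hypothetical helper: return true only if the whole byte range
 * [addr, addr + len) lies inside [KERNEL_BASE, high_mem), where
 * high_mem is the virtual address just past the end of normal memory.
 */
static bool addr_range_plausible(uint32_t addr, uint32_t len, uint32_t high_mem)
{
    if (addr < KERNEL_BASE || addr >= high_mem)
        return false;
    /* Compare against the remaining space so addr + len cannot wrap. */
    return len <= high_mem - addr;
}
```

With 32 MB of RAM, high_mem would be 0xC2000000 under these assumptions. The
oops dump routine would call such a check before reading the 20 code bytes at
EIP and print a placeholder instead of faulting when it fails.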

I was not loading or unloading any modules, at least not myself; I don't know
what happened behind my back.
I have 32 MB of memory and 18 MB of swap, and not all of it was in use. (I
usually load the machine more heavily, with more software running and a local
X server, which eats another 10 MB of RAM.)

Unfortunately, I'm not in a position to experiment much with the machine right
now: I'm doing this at the office, with my laptop connected to the office
network while I work on my NT workstation. Hence I have no access to serial
cables or printers for any fancy oops capturing. I'm working away from home;
when I go 'home' in the evening, it is to the hotel. And since the laptop is
currently without a floppy drive, that's not an option either... A reserved
area on disk of, say, 32 KB for capturing oopses (and perhaps any dmesg
information) across reboots would be much better.

Tomorrow I will be home again - but only to go on a 3 week vacation. :-)

I will try to catch a few more oopsen today, if I can. I will also see what
happens if I compile the kernel with -O2 instead of -O6.

And I will see if I can catch an oops from kernel 2.3.33.

--Tim

> --
> Manfred
>
> "Leeuw van der, Tim" wrote:
> >
> > Ok, thanks to Manfred I re-decoded the first OOPS that I got!
> > I mistook the <> for () - rather silly of me perhaps. But the output is a
> > lot more useful now.
> >
> > Here comes the oops again:
> >
> > bonsai:~ # ksymoops -m /boot22/System.map -o /lib/modules/2.2.14pre12/ -K -L < oops.txt
> > Options used: -V (default)
> > -o /lib/modules/2.2.14pre12/ (specified)
> > -K (specified)
> > -L (specified)
> > -m /boot22/System.map (specified)
> > -c 1 (default)
> >
> > No modules in ksyms, skipping objects
> > CPU: 0
> > EIP: 0010:[<c0109259>]
> > EFLAGS: 00010046
> > eax: 00000000 ebx: 00000000 ecx: 00000000 edx: c01bf0a8
> > esi: c0100175 edi: c01d0000 ebp: c2800000 esp: c01cfbe4
> > ds: 0018 es: 0018 ss: 0018
> > Process swapper <pid: 0, process nr: 0, stackpage=c01cf000>
> > Stack: 00000000 c01cfcfc c01ded43 00000246 c1c8da00 c01cfcfc 00000000 c136da20
> >        c0e68a2c 0000032f 00000000 00010046 02000000 c3000000 c01093a4 c01cfc68
> >        c01a05f8 c01a1d0e 00000000 00000000 c010e7f0 c01a1d0e c01cfc68 00000000
> > Call Trace: [<c3000000>] [<c01093a4>] [<c01a05f8>] [<c01a1d0e>] [<c010e7f0>] [<c01a1d0e>] [<c0108ead>]
> >        [<c2861219>] [<c286e164>] [<c01559ee>] [<c0158409>] [<c01542b7>] [<c015d8d9>] [<c015dba6>] [<c0164939>]
> >        [<c0164995>] [<c0152898>] [<c016522c>] [<c01652ad>] [<c0161f9d>] [<c01635b6>] [<c01684f3>] [<c2861ab9>]
> >        [<c01687c6>] [<c0168a7e>] [<c015b582>] [<c015b806>] [<c0154619>] [<c01183dd>] [<c010a679>] [<c0109eb8>]
> >        [<c0107609>] [<c0106000>] [<c0107ca0>] [<c0108d74>] [<c0106000>] [<c0106000>] [<c0100175>]
> > Code: 8a 04 0b 89 44 24 38 50 68 f0 05 1a c0 e8 59 a7 00 00 83 c4
> >
> > >>EIP: c0109259 <show_registers+24d/280>
> > Trace: c3000000 <END_OF_CODE+2e0917c/????>
> > Trace: c01093a4 <die+30/38>
> > Trace: c01a05f8 <error_table+974/219c>
> > Trace: c01a1d0e <error_table+208a/219c>
> > Trace: c010e7f0 <do_page_fault+2bc/384>
> > Trace: c01a1d0e <error_table+208a/219c>
> > Trace: c0108ead <error_code+2d/40>
> > Trace: c2861219 <END_OF_CODE+266a395/????>
> > Trace: c0164995 <tcp_transmit_skb+3d1/3dc>
> > Trace: c01687c6 <tcp_v4_rcv+66/3a4>
> > Trace: c0107609 <cpu_idle+a1/b4>
> > Code: c0109259 <show_registers+24d/280> 00000000 <_EIP>: <===
> > Code: c0109259 <show_registers+24d/280>  0: 8a 04 0b        mov (%ebx,%ecx,1),%al <===
> > Code: c010925c <show_registers+250/280>  3: 89 44 24 38     mov %eax,0x38(%esp,1)
> > Code: c0109260 <show_registers+254/280>  7: 50              push %eax
> > Code: c0109261 <show_registers+255/280>  8: 68 f0 05 1a c0  push $0xc01a05f0
> > Code: c0109266 <show_registers+25a/280>  d: e8 59 a7 00 00  call c01139c4 <printk+0/16c>
> > Code: c010926b <show_registers+25f/280> 12: 83 c4 00        add $0x0,%esp
> >
> > Aiee, killing interrupt handler
> > Kernel panic: Attempted to kill the idle task!
> > In swapper task - not synching
> >
>
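
For readers unfamiliar with the symbol+offset/size notation in the ksymoops
output above (for example show_registers+24d/280), the sketch below shows one
way such a lookup can be done: find the last symbol whose start address is not
above the target, then subtract. This is an illustrative sketch, not ksymoops'
actual implementation. The struct, the function name ksym_lookup and the table
are hypothetical, and the start addresses in the table were derived by
subtracting the printed offsets from the printed addresses, not read from the
real System.map.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical symbol table entry: start address and name. */
struct ksym {
    uint32_t addr;
    const char *name;
};

/* Example entries, sorted by address; values derived from the oops above. */
static const struct ksym example_syms[] = {
    { 0xc010900cu, "show_registers" }, /* c0109259 - 0x24d */
    { 0xc0109374u, "die" },            /* c01093a4 - 0x30  */
    { 0xc01139c4u, "printk" },         /* target of the call at +25a */
};

/*
 * Binary search for the last symbol with addr <= target in a table
 * sorted by ascending address. Returns NULL if target is below the
 * first symbol.
 */
static const struct ksym *ksym_lookup(const struct ksym *tab, size_t n,
                                      uint32_t target)
{
    const struct ksym *best = NULL;
    size_t lo = 0, hi = n;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (tab[mid].addr <= target) {
            best = &tab[mid];   /* candidate; look for a later one */
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }
    return best;
}
```

The offset printed after the + sign is then simply target minus the returned
symbol's start address.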
