Re: test4 and test5 fail to boot for me (kswap related)...

From: jeff@ntcor.com
Date: Fri Jul 28 2000 - 20:15:19 EST


Sorry about that. I'm new to this reporting stuff (obviously). Somebody please
help me if you can...

In response to Mr Larsson's teachings I present the following:

1) short: kswapd v1.7 dereferences a NULL pointer during boot of linux-2.4.0-test5
2) long description: same (it's a simple oops during boot).
   Just a couple of notable items:
     I did *not* compile the kernel with high-mem support.
     All command output included here was generated on the exact same machine,
     rebooted into the linux-2.2.16-8 kernel provided with RedHat rawhide.
     The compiler on this machine is gcc 2.96.
3) keywords: linux-2.4.0-test5, test5, kswapd, NULL
4) kernel version: linux-2.4.0-test5
5) oops output:

Unable to handle NULL pointer dereference at virtual address 00000000
printing eip:
c0117164
*pde = 00000000
Oops: 0000
CPU: 0
EIP: 0010:[<c0117164>]
EFLAGS: 00010092
eax: c14a6000 ebx: 00000023 ecx: 00000000 edx: c021b538
esi: 00000000 edi: c14a6222 ebp: c14a7fcc esp: c14a7fa4
ds: 0018 es: 0018 ss:0018
Process kflushd (pid: 3, stackpage=c14a7000)
Stack: 00000000 c011cd6d 00000246 00000000 c14bdf48 c14a6000 00000023 c14a6000
       c14a6000 c14a6222 0008e000 c0109300 c14bdf3c ffffffff c14a6000 c01abd17
       00000f00 c14bdf70 c01318a0 c0108c2b c14bdf3c 00000078 c01ebfc0
Call Trace: [<c011cd6d>] [<c0109300>] [<c01abd17>] [<c01318a0>] [<c0108c2b>]
Code: 8b 01 89 45 dc 21 d8 a9 df ff ff ff 0f 84 f8 00 00 00 8b 55

Sorry, but I tried resolving this spaghetti with ksymoops and got this instead:

[root@stingray linux]# ksymoops -v /usr/src/linux/vmlinux -K -o /lib/modules/2.4.0-test5 -L < /home/jeff/oops
ksymoops 0.7c on i686 2.2.16-8. Options used
     -v /usr/src/linux/vmlinux (specified)
     -K (specified)
     -L (specified)
     -o /lib/modules/2.4.0-test5 (specified)
     -m /usr/src/linux/System.map (default)

No modules in ksyms, skipping objects
Fatal Error (re_compile): on '^((Stack: )|(Stack from )|([0-9a-fA-F]{4,})|(Call Trace: )|(\[*<([0-9a-fA-F]{4,})>\]* *)|(Version_[0-9]+)|(Trace: )|(Call backtrace:)|(bh:)|<\[([0-9a-fA-F]{4,})\]> *|Process .*stackpage=|Code *: |Kernel panic|In swapper task|Corrupted stack page|invalid operand: |Oops:
|Cpu: +[0-9]|current->tss|\*pde +=|EIP: |EFLAGS: |eax: |esi: |ds: |CR0: |wait_on_|irq: |pc[:=]|68060 access|Exception at |d[04]: |Frame format=|wb [0-9] stat|push data: |baddr=|Arithmetic fault|Instruction fault|Bad unaligned kernel|Forwarding unaligned exception|: unhandled unaligned exception|pc
*=|r[0-9]+ *=|gp *=|spinlock stuck|tsk->|PSR: |[goli]0: |Instruction DUMP: |TSTATE: |[goli]4: |Caller\[|CPU\[|MSR: |TASK = |last math |GPR[0-9]+: |\$[0-9 ]+:|epc |Status:|Cause :|Backtrace:|Function entered at|\*pgd =|Internal error|pc :|sp :|r[0-9][0-9 ]:|Flags:|Control:)' - Invalid range end

That didn't seem too useful, since it looks like ksymoops dies while compiling
its own regular expression ("Invalid range end"). But doing this...

cat > debug.c << 'EOF'
/* Opcode bytes copied from the "Code:" line of the oops above. */
char str[] = "\x8b\x01\x89\x45\xdc\x21\xd8\xa9\xdf\xff\xff\xff\x0f\x84\xf8\x00\x00\x00\x8b\x55";
int main(void) { return 0; }
EOF

gcc -g debug.c

[jeff@stingray jeff]$ gdb a.out
GNU gdb 5.0
Copyright 2000 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux"...
(gdb) disassemble str
Dump of assembler code for function str:
0x80494a4 <str>: mov (%ecx),%eax
0x80494a6 <str+2>: mov %eax,0xffffffdc(%ebp)
0x80494a9 <str+5>: and %ebx,%eax
0x80494ab <str+7>: test $0xffffffdf,%eax
0x80494b0 <str+12>: je 0x80495ae
0x80494b6 <str+18>: mov 0x0(%ebp),%edx
0x80494b9 <str+21>: add %al,(%eax)
0x80494bb <str+23>: add %al,(%eax)
End of assembler dump.
(gdb) quit
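
In case it saves someone the typing: the str[] literal above is just the bytes
from the "Code:" line with \x in front of each one. A minimal sketch of doing
that mechanically follows. code_to_c_escapes is a name made up for illustration
(it is not part of ksymoops or the kernel tree), and it assumes the plain
two-hex-digits-per-byte layout shown in this oops.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only.
 * Turns the hex bytes of an oops "Code:" line (without the "Code:" prefix)
 * into the body of a C string literal: "8b 01 89" becomes \x8b\x01\x89.
 * Returns the number of bytes converted, or -1 if the input is malformed
 * or the output buffer is too small.
 */
int code_to_c_escapes(const char *code, char *out, size_t outlen)
{
    size_t used = 0;
    int count = 0;

    if (outlen == 0)
        return -1;

    for (;;) {
        /* skip the whitespace that separates the bytes */
        while (*code == ' ' || *code == '\t' || *code == '\n')
            code++;
        if (*code == '\0')
            break;

        /* every byte must be written as exactly two hex digits */
        if (!isxdigit((unsigned char)code[0]) ||
            !isxdigit((unsigned char)code[1]))
            return -1;

        /* need room for "\xNN" plus the terminating NUL */
        if (used + 5 > outlen)
            return -1;

        out[used++] = '\\';
        out[used++] = 'x';
        out[used++] = (char)tolower((unsigned char)code[0]);
        out[used++] = (char)tolower((unsigned char)code[1]);
        code += 2;
        count++;
    }

    out[used] = '\0';
    return count;
}
```

Feeding it the 20 bytes from the oops and printing the result between
char str[] = " and "; should reproduce the str[] line of debug.c.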

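I think that probably helps somebody else more than it helps me. The first
instruction, mov (%ecx),%eax, would fit the oops, since ecx is 00000000 in the
register dump. Beyond that I can't make much sense of it, except that it looks
like a conditional test. I mean, how many conditional tests could possibly be
in the kernel?? ;-)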
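
One way to narrow that down without ksymoops would be to look up the faulting
EIP (c0117164) in System.map by hand: the function containing it is the symbol
with the highest address that is still <= the EIP. A rough sketch of that
lookup is below. nearest_symbol is a hypothetical helper, not an existing tool.
It assumes the usual "address type name" layout of System.map lines, and the
answer only means something if the System.map matches the kernel that oopsed.

```c
#include <stdio.h>

/*
 * Hypothetical helper, for illustration only.
 * Scans System.map-style text ("c0117000 T some_symbol", one per line)
 * and finds the symbol with the highest address that is <= addr.
 * On success returns 0, copies the symbol into name and stores the
 * distance from the symbol start in *offset. Returns -1 if no symbol
 * lies at or below addr.
 */
int nearest_symbol(FILE *fp, unsigned long addr,
                   char *name, size_t namelen, unsigned long *offset)
{
    char line[256];
    char sym[200];
    char type;
    unsigned long a;
    unsigned long best = 0;
    int found = 0;

    while (fgets(line, sizeof line, fp) != NULL) {
        /* skip anything that is not "hexaddr type name" */
        if (sscanf(line, "%lx %c %199s", &a, &type, sym) != 3)
            continue;

        /* linear scan, so the file does not have to be sorted */
        if (a <= addr && (!found || a >= best)) {
            best = a;
            snprintf(name, namelen, "%s", sym);
            found = 1;
        }
    }

    if (!found)
        return -1;

    *offset = addr - best;
    return 0;
}
```

Opening /usr/src/linux/System.map with fopen() and passing 0xc0117164, then
each address from the Call Trace line, would give a symbol+offset for every
entry, which is roughly the part of the ksymoops output that matters here.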

I cannot provide a shell script to reproduce this (the oops occurs during boot,
before any file systems get mounted).

Here's the "environment" that I can give you...

Here's the processor info:
[jeff@stingray jeff]$ cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 5
model name : Pentium II (Deschutes)
stepping : 2
cpu MHz : 225.516
cache size : 512 KB
fdiv_bug : no
hlt_bug : no
sep_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr
bogomips : 448.92

here's the memory setup:
[jeff@stingray jeff]$ free
             total       used       free     shared    buffers     cached
Mem:        257692     171964      85728     112016      16292      69772
-/+ buffers/cache:      85900     171792
Swap:       263160          0     263160

Sorry about the blast of output that follows, but REPORTING-BUGS says to
include it. The PCI info...

[root@stingray linux]# /sbin/lspci -vvv | more
00:00.0 Host bridge: Intel Corporation 440BX/ZX - 82443BX/ZX Host bridge (rev 02)
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ >SERR- <PERR-
        Latency: 64
        Region 0: Memory at f8000000 (32-bit, prefetchable)
        Capabilities: [a0] AGP version 1.0
                Status: RQ=31 SBA+ 64bit- FW- Rate=x1,x2
                Command: RQ=0 SBA- AGP- 64bit- FW- Rate=<none>

00:01.0 PCI bridge: Intel Corporation 440BX/ZX - 82443BX/ZX AGP bridge (rev 02) (prog-if 00 [Normal decode])
        Control: I/O+ Mem+ BusMaster+ SpecCycle+ MemWINV+ VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap- 66Mhz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 64
        Bus: primary=00, secondary=01, subordinate=01, sec-latency=64
        I/O behind bridge: 0000d000-0000dfff
        Memory behind bridge: fda00000-feafffff
        Prefetchable memory behind bridge: f3800000-f58fffff
        BridgeCtl: Parity- SERR- NoISA- VGA+ MAbort- >Reset- FastB2B+

00:07.0 ISA bridge: Intel Corporation 82371AB PIIX4 ISA (rev 02)
        Control: I/O+ Mem+ BusMaster+ SpecCycle+ MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 0

00:07.1 IDE interface: Intel Corporation 82371AB PIIX4 IDE (rev 01) (prog-if 80 [Master])
        Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Region 4: I/O ports at ffa0 [disabled]

00:07.2 USB Controller: Intel Corporation 82371AB PIIX4 USB (rev 01) (prog-if 00 [UHCI])
        Control: I/O+ Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Interrupt: pin D routed to IRQ 0
        Region 4: I/O ports at ef80

00:07.3 Bridge: Intel Corporation 82371AB PIIX4 ACPI (rev 02)
        Control: I/O+ Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-

00:10.0 SCSI storage controller: Adaptec AHA-2940U2/W
        Subsystem: Adaptec: Unknown device a180
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 64 (9750ns min, 6250ns max), cache line size 08
        Interrupt: pin A routed to IRQ 11
        BIST result: 00
        Region 0: I/O ports at e800
        Region 1: Memory at febff000 (64-bit, non-prefetchable)
        Expansion ROM at febc0000 [disabled]
        Capabilities: [dc] Power Management version 1
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 PME-Enable- DSel=0 DScale=0 PME-

00:11.0 Ethernet controller: 3Com Corporation 3c905B 100BaseTX [Cyclone] (rev 30)
        Subsystem: 3Com Corporation 3C905B Fast Etherlink XL 10/100
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B-
        Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 64 (2500ns min, 2500ns max), cache line size 08
        Interrupt: pin A routed to IRQ 9
        Region 0: I/O ports at ec00
        Region 1: Memory at febfef80 (32-bit, non-prefetchable)
        Expansion ROM at feba0000 [disabled]
        Capabilities: [dc] Power Management version 1
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1+,D2+,D3hot+,D3cold+)
                Status: D0 PME-Enable- DSel=0 DScale=0 PME-

00:12.0 Ethernet controller: 3Com Corporation 3c905B 100BaseTX [Cyclone] (rev 30)
        Subsystem: 3Com Corporation 3C905B Fast Etherlink XL 10/100
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B-
        Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 64 (2500ns min, 2500ns max), cache line size 08
        Interrupt: pin A routed to IRQ 15
        Region 0: I/O ports at e480
        Region 1: Memory at febfef00 (32-bit, non-prefetchable)
        Expansion ROM at feb80000 [disabled]
        Capabilities: [dc] Power Management version 1
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1+,D2+,D3hot+,D3cold+)
                Status: D0 PME-Enable- DSel=0 DScale=0 PME-

01:00.0 VGA compatible controller: Matrox Graphics, Inc. MGA G200 AGP (rev 01) (prog-if 00 [VGA])
        Subsystem: Matrox Graphics, Inc. Millennium G200 AGP
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 64 (4000ns min, 8000ns max), cache line size 08
        Interrupt: pin A routed to IRQ 11
        Region 0: Memory at f4000000 (32-bit, prefetchable)
        Region 1: Memory at feafc000 (32-bit, non-prefetchable)
        Region 2: Memory at fe000000 (32-bit, non-prefetchable)
        Expansion ROM at feae0000 [disabled]
        Capabilities: [dc] Power Management version 1
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [f0] AGP version 1.0
                Status: RQ=31 SBA+ 64bit- FW- Rate=x1,x2
                Command: RQ=31 SBA+ AGP+ 64bit- FW- Rate=x2

[root@stingray linux]# cat /proc/scsi/scsi
Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
  Vendor: SEAGATE Model: ST39103LW Rev: 0002
  Type: Direct-Access ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 01 Lun: 00
  Vendor: SEAGATE Model: ST136403LW Rev: 0002
  Type: Direct-Access ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 02 Lun: 00
  Vendor: SEAGATE Model: ST39173W Rev: 6244
  Type: Direct-Access ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 03 Lun: 00
  Vendor: Quantum Model: XP34300W Rev: L912
  Type: Direct-Access ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 04 Lun: 00
  Vendor: iomega Model: jaz 2GB Rev: E.17
  Type: Direct-Access ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 05 Lun: 00
  Vendor: PIONEER Model: CD-ROM DR-U06S Rev: 1.05
  Type: CD-ROM ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 06 Lun: 00
  Vendor: YAMAHA Model: CRW6416S Rev: 1.0b
  Type: CD-ROM ANSI SCSI revision: 02

Roger Larsson wrote:
>
> Hi,
>
> You forgot to include some important info...
>
> * like memory configuration (little/much/highmem)
> * ... (see linux/REPORTING-BUGS)
>
> /RogerL
>
> jeff@ntcor.com wrote:
> >
> > Using RawHide (probably adds to the problem) with GCC 2.96
> >
> > When I compile test4 everything compiles fine but when I boot
> > the test 4 kernel it hangs during the boot sequence just after
> > printing:
> > Starting kswapd v1.6
> >
> > When I compile test5 everything compiles, but the kernel dies
> > during the boot sequence due to a dereferenced NULL pointer
> > just after kswapd v1.7 is started.
> >
> > It looks like kswapd is undergoing some changes. I'm not able
> > to fix them, but I'll add my $0.02 to indicate that it is either
> > a) having infinite loops in 1.6 or
> > b) having memory usage problems in 1.7
> >
> > - Jeff
> >
> > -
> > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > the body of a message to majordomo@vger.rutgers.edu
> > Please read the FAQ at http://www.tux.org/lkml/
>
> --
> Home page:
> http://www.norran.net/nra02596/


This archive was generated by hypermail 2b29 : Mon Jul 31 2000 - 21:00:29 EST