Oopses with 2.1.8

Philippe Strauss (philou@sicel-home-1-4.urbanet.ch)
Tue, 12 Nov 1996 17:25:27 +0100 (MET)


Hi,

A few days ago I posted an oops report from my box running 2.1.8.
Nobody answered, so I must be the only one seeing this.

I have not been able to run a 2.1.8 kernel for more than 10 minutes
without it crashing.

My first 2.1.8 kernels were built with gcc 2.7.2.f.1 (gcc + Fortran g77).
I then recompiled a stock gcc 2.7.2, in case g77 had added some obscure bug
to the gcc backend. Kernels compiled with stock 2.7.2 crash as well.

Two hours of memtest86 found no problem with my RAM, and my CPU
fan is still running.

Here is one of the latest oopses:

Nov 12 15:16:52 sicel-home-1-4 kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000064
Nov 12 15:16:52 sicel-home-1-4 kernel: current->tss.cr3 = 00101000, Dr3 = 00101000
Nov 12 15:16:52 sicel-home-1-4 kernel: *pde = 00000000
Nov 12 15:16:52 sicel-home-1-4 kernel: Oops: 0000
Nov 12 15:16:52 sicel-home-1-4 kernel: CPU: 0
Nov 12 15:16:52 sicel-home-1-4 kernel: EIP: 0010:[<c012fb25>]
Nov 12 15:16:52 sicel-home-1-4 kernel: EFLAGS: 00010046
Nov 12 15:16:52 sicel-home-1-4 kernel: eax: 0000000c ebx: c0275c10 ecx: 00000001 edx: c0275c10
Nov 12 15:16:52 sicel-home-1-4 kernel: esi: c1c3d0cc edi: 00000054 ebp: c0084808 esp: c01e34e0
Nov 12 15:16:52 sicel-home-1-4 kernel: ds: 0018 es: 0018 ss: 0018
Nov 12 15:16:52 sicel-home-1-4 kernel: Process swapper (pid: 0, process nr: 0, stackpage=c01e16bc)
Nov 12 15:16:52 sicel-home-1-4 kernel: Stack: c0275c10 c1c3d0cc 00000002 c0084808 c0084808 c01259a5 c0275c10 c1c3d0cc
Nov 12 15:16:52 sicel-home-1-4 kernel: c0084860 0000000a c01a6313 c1c3d0cc c1c3d0cc 00000001 00000054 00000010
Nov 12 15:16:52 sicel-home-1-4 kernel: c0084808 00000000 c01a67e9 c0084808 00000001 00000010 00000000 c0084808
Nov 12 15:16:52 sicel-home-1-4 kernel: Call Trace: [<c01259a5>] [<c01a6313>] [<c01a67e9>] [<c01a2ac4>] [<c01ae685>] [<c01ae6aa>] [<c01ae70f>]
Nov 12 15:16:52 sicel-home-1-4 kernel: [<c01af1e1>] [<c0111680>] [<c01ad621>] [<c01b1753>] [<c010c8ca>] [<c010c391>] [<c0109654>] [<c010a7e8>]
Nov 12 15:16:52 sicel-home-1-4 kernel: [<c0109348>] [<c01091c3>] [<c01c5e5d>] [<c0116c08>]
Nov 12 15:16:52 sicel-home-1-4 kernel: Code: 66 83 7f 10 02 74 07 83 7c 24 24 00 74 0f c7 44 24 10 12 08
Nov 12 15:16:52 sicel-home-1-4 kernel: Aiee, killing interrupt handler
Nov 12 15:16:52 sicel-home-1-4 kernel: kfree of non-kmalloced memory: c01e3704, next= 00000000, order=0
Nov 12 15:16:52 sicel-home-1-4 kernel: kfree of non-kmalloced memory: c01e36f4, next= 00000000, order=0
Nov 12 15:16:52 sicel-home-1-4 kernel: kfree of non-kmalloced memory: c01e3c08, next= 00000000, order=0
Nov 12 15:16:52 sicel-home-1-4 kernel: idle task may not sleep
Nov 12 15:16:52 sicel-home-1-4 last message repeated 4 times
Nov 12 15:16:55 sicel-home-1-4 kernel: Socket destroy delayed (r=0 w=248)
Nov 12 15:17:05 sicel-home-1-4 last message repeated 3 times
Nov 12 15:17:49 sicel-home-1-4 syslogd: exiting on signal 15

The same oops after going through colrm and ksymoops:

Using `/boot/SysMap/System.map.218.96.11.12.14.44' to map addresses to symbols.

>>EIP: c012fb25 <load_elf_interp+189/2e0>

Code: c012fb25 <load_elf_interp+189/2e0> cmpw $0x2,0x10(%edi)
Code: c012fb2a <load_elf_interp+18e/2e0> je c012fb33 <load_elf_interp+197/2e0>
Code: c012fb2c <load_elf_interp+190/2e0> cmpl $0x0,0x24(%esp,1)
Code: c012fb31 <load_elf_interp+195/2e0> je c012fb42 <load_elf_interp+1a6/2e0>
Code: c012fb33 <load_elf_interp+197/2e0> movl $0x90000812,0x10(%esp,1)
Code: c012fb3b <load_elf_interp+19f/2e0> nop
Code: c012fb3c <load_elf_interp+1a0/2e0> nop
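
As a sanity check on the decode, here is a minimal sketch (the variable names are illustrative; the values are copied from the oops and the ksymoops output above). The instruction at EIP reads from 0x10(%edi), so with edi = 0x54 the access should land exactly on the reported fault address:

```python
# Values copied from the oops report and the ksymoops listing above.
fault_address = 0x00000064   # "virtual address 00000064"
edi = 0x00000054             # "edi: 00000054"
displacement = 0x10          # from "cmpw $0x2,0x10(%edi)"

# Effective address of the memory operand 0x10(%edi).
effective_address = edi + displacement

print(hex(effective_address))
print(effective_address == fault_address)
```

If the two agree, the instruction at EIP and the register dump are at least consistent with each other, and edi holds a small bogus pointer rather than a valid kernel address.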

objdump -D /usr/src/linux/fs/binfmt_elf.o:

00000459 <load_elf_interp+189> cmpw $0x2,0x10(%edi)
0000045e <load_elf_interp+18e> je 00000467 <load_elf_interp+197>
00000460 <load_elf_interp+190> cmpl $0x0,0x24(%esp,1)
00000465 <load_elf_interp+195> je 00000476 <load_elf_interp+1a6>
00000467 <load_elf_interp+197> movl $0x812,0x10(%esp,1)
0000046f <load_elf_interp+19f> movl 0x8(%ebx),%esi
00000472 <load_elf_interp+1a2> movl %esi,0x14(%esp,1)
00000476 <load_elf_interp+1a6> movl 0x8(%ebx),%edx
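
To confirm that the two listings describe the same instructions, the runtime addresses can be compared with the object-file addresses. This is a small sketch with hypothetical variable names; the address pairs are copied from the ksymoops and objdump listings above, and the function start is derived from the +189 offset both tools report:

```python
# (runtime address from ksymoops, address from objdump) for each instruction.
pairs = [
    (0xc012fb25, 0x459),
    (0xc012fb2a, 0x45e),
    (0xc012fb2c, 0x460),
    (0xc012fb31, 0x465),
    (0xc012fb33, 0x467),
    (0xc012fb3b, 0x46f),
]

# Both listings place the first instruction at load_elf_interp+0x189.
runtime_base = 0xc012fb25 - 0x189   # start of load_elf_interp in the running kernel
object_base = 0x459 - 0x189         # start of load_elf_interp in binfmt_elf.o

# Each instruction should sit at the same offset from the function start.
offsets_match = all(r - runtime_base == o - object_base for r, o in pairs)

print(hex(runtime_base), hex(object_base), offsets_match)
```

Matching offsets mean the instruction boundaries agree up to and including the movl at +197, so the listings only diverge in the bytes that follow.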

What are those nops replacing the movl instructions? Self-modifying code? Shitty DRAM?
Something in kernel space scribbling over memory?
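
One thing that may be worth ruling out first is offered here as a guess, not a diagnosis. The Code: line of the oops holds only a fixed number of bytes, and the instructions ksymoops lists may need more than that. The sketch below (variable names are illustrative, data copied from the output above) counts both. How ksymoops fills in bytes that are missing from its input is an assumption here and has not been checked:

```python
# Bytes exactly as printed on the "Code:" line of the oops.
code_line = "66 83 7f 10 02 74 07 83 7c 24 24 00 74 0f c7 44 24 10 12 08"
dumped = bytes.fromhex(code_line)

# Start addresses of the instructions ksymoops printed, up to the first nop.
starts = [0xc012fb25, 0xc012fb2a, 0xc012fb2c, 0xc012fb31, 0xc012fb33, 0xc012fb3b]

# Length of each instruction = distance to the next instruction's start.
lengths = [b - a for a, b in zip(starts, starts[1:])]

# Bytes the listed instructions need beyond what the oops actually dumped.
missing = sum(lengths) - len(dumped)

# The part of the movl (starting at EIP + 0xe) that is present in the dump.
movl_bytes = dumped[0xc012fb33 - 0xc012fb25:]

print(len(dumped), lengths, missing, movl_bytes.hex())
```

If the count comes up short, the upper bytes of the movl immediate and the nops after it were never in the dump, so they would say nothing about what is really in memory at that address. The two immediate bytes that are present (12 08) agree with the $0x812 in the objdump listing.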

Sometimes the oops comes from dcache.c (I don't remember exactly where).

Could a kind soul try this .config? It may shed some light on this.

Note: I no longer start GPM (gpm-1.10) at boot, and I never launch it
under 2.1.8.


#
# Automatically generated make config: don't edit
#

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y

#
# Loadable module support
#
CONFIG_MODULES=y
CONFIG_MODVERSIONS=y
CONFIG_KERNELD=y

#
# General setup
#
# CONFIG_MATH_EMULATION is not set
CONFIG_NET=y
# CONFIG_MAX_16M is not set
CONFIG_PCI=y
CONFIG_PCI_OPTIMIZE=y
CONFIG_SYSVIPC=y
CONFIG_BINFMT_AOUT=y
CONFIG_BINFMT_ELF=y
# CONFIG_BINFMT_JAVA is not set
# CONFIG_M386 is not set
# CONFIG_M486 is not set
CONFIG_M586=y
# CONFIG_M686 is not set

#
# Floppy, IDE, and other block devices
#
CONFIG_BLK_DEV_FD=y
CONFIG_BLK_DEV_IDE=y

#
# Please see Documentation/ide.txt for help/info on IDE drives
#
# CONFIG_BLK_DEV_HD_IDE is not set
CONFIG_BLK_DEV_IDEDISK=y
CONFIG_BLK_DEV_IDECD=y
# CONFIG_BLK_DEV_IDETAPE is not set
# CONFIG_BLK_DEV_IDEFLOPPY is not set
# CONFIG_BLK_DEV_CMD640 is not set
# CONFIG_BLK_DEV_RZ1000 is not set
CONFIG_BLK_DEV_TRITON=y
# CONFIG_IDE_CHIPSETS is not set

#
# Additional Block Devices
#
# CONFIG_BLK_DEV_LOOP is not set
# CONFIG_BLK_DEV_MD is not set
# CONFIG_BLK_DEV_RAM is not set
# CONFIG_BLK_DEV_XD is not set
# CONFIG_BLK_DEV_HD is not set

#
# Networking options
#
CONFIG_FIREWALL=y
# CONFIG_NET_ALIAS is not set
CONFIG_INET=y
CONFIG_IP_FORWARD=y
# CONFIG_IP_MULTICAST is not set
CONFIG_IP_FIREWALL=y
CONFIG_IP_FIREWALL_VERBOSE=y
CONFIG_IP_MASQUERADE=y

#
# Protocol-specific masquerading support will be built as modules.
#
# CONFIG_IP_TRANSPARENT_PROXY is not set
CONFIG_IP_ALWAYS_DEFRAG=y
CONFIG_IP_ACCT=y
# CONFIG_IP_ROUTER is not set
# CONFIG_NET_IPIP is not set

#
# (it is safe to leave these untouched)
#
# CONFIG_INET_PCTCP is not set
# CONFIG_INET_RARP is not set
# CONFIG_NO_PATH_MTU_DISCOVERY is not set
CONFIG_IP_NOSR=y
CONFIG_SKB_LARGE=y
CONFIG_IPV6=m

#
#
#
# CONFIG_IPX is not set
# CONFIG_ATALK is not set
# CONFIG_AX25 is not set
# CONFIG_BRIDGE is not set
# CONFIG_NETLINK is not set

#
# SCSI support
#
CONFIG_SCSI=y

#
# SCSI support type (disk, tape, CD-ROM)
#
CONFIG_BLK_DEV_SD=y
# CONFIG_CHR_DEV_ST is not set
CONFIG_BLK_DEV_SR=y
CONFIG_CHR_DEV_SG=y

#
# Some SCSI devices (e.g. CD jukebox) support multiple LUNs
#
# CONFIG_SCSI_MULTI_LUN is not set
CONFIG_SCSI_CONSTANTS=y

#
# SCSI low-level drivers
#
# CONFIG_SCSI_7000FASST is not set
# CONFIG_SCSI_AHA152X is not set
# CONFIG_SCSI_AHA1542 is not set
# CONFIG_SCSI_AHA1740 is not set
# CONFIG_SCSI_AIC7XXX is not set
# CONFIG_SCSI_ADVANSYS is not set
# CONFIG_SCSI_IN2000 is not set
# CONFIG_SCSI_AM53C974 is not set
# CONFIG_SCSI_BUSLOGIC is not set
# CONFIG_SCSI_DTC3280 is not set
# CONFIG_SCSI_EATA_DMA is not set
# CONFIG_SCSI_EATA_PIO is not set
# CONFIG_SCSI_EATA is not set
# CONFIG_SCSI_FUTURE_DOMAIN is not set
# CONFIG_SCSI_GENERIC_NCR5380 is not set
# CONFIG_SCSI_NCR53C406A is not set
# CONFIG_SCSI_NCR53C7xx is not set
CONFIG_SCSI_NCR53C8XX=y
# CONFIG_SCSI_NCR53C8XX_TAGGED_QUEUE is not set
# CONFIG_SCSI_NCR53C8XX_IOMAPPED is not set
CONFIG_SCSI_NCR53C8XX_MAX_TAGS=12
CONFIG_SCSI_NCR53C8XX_SYNC=10
# CONFIG_SCSI_NCR53C8XX_NO_DISCONNECT is not set
# CONFIG_SCSI_NCR53C8XX_DISABLE_MPARITY_CHECK is not set
# CONFIG_SCSI_NCR53C8XX_DISABLE_PARITY_CHECK is not set
# CONFIG_SCSI_NCR53C8XX_FORCE_SYNC_NEGO is not set
# CONFIG_SCSI_PPA is not set
# CONFIG_SCSI_PAS16 is not set
# CONFIG_SCSI_QLOGIC_FAS is not set
# CONFIG_SCSI_QLOGIC_ISP is not set
# CONFIG_SCSI_SEAGATE is not set
# CONFIG_SCSI_T128 is not set
# CONFIG_SCSI_U14_34F is not set
# CONFIG_SCSI_ULTRASTOR is not set

#
# Network device support
#
CONFIG_NETDEVICES=y
# CONFIG_DUMMY is not set
# CONFIG_EQUALIZER is not set
# CONFIG_DLCI is not set
# CONFIG_PLIP is not set
# CONFIG_PPP is not set
# CONFIG_SLIP is not set
# CONFIG_NET_RADIO is not set
CONFIG_NET_ETHERNET=y
# CONFIG_NET_VENDOR_3COM is not set
# CONFIG_LANCE is not set
# CONFIG_NET_VENDOR_SMC is not set
CONFIG_NET_ISA=y
# CONFIG_AT1700 is not set
# CONFIG_E2100 is not set
# CONFIG_DEPCA is not set
# CONFIG_EWRK3 is not set
# CONFIG_EEXPRESS is not set
# CONFIG_EEXPRESS_PRO is not set
# CONFIG_FMV18X is not set
# CONFIG_HPLAN_PLUS is not set
# CONFIG_HPLAN is not set
# CONFIG_HP100 is not set
# CONFIG_ETH16I is not set
CONFIG_NE2000=y
# CONFIG_NI52 is not set
# CONFIG_NI65 is not set
# CONFIG_SEEQ8005 is not set
# CONFIG_SK_G16 is not set
CONFIG_NET_EISA=y
# CONFIG_AC3200 is not set
# CONFIG_APRICOT is not set
# CONFIG_DE4X5 is not set
# CONFIG_DEC_ELCP is not set
# CONFIG_DGRS is not set
# CONFIG_ZNET is not set
# CONFIG_NET_POCKET is not set
# CONFIG_TR is not set
# CONFIG_ARCNET is not set

#
# ISDN subsystem
#
# CONFIG_ISDN is not set

#
# CD-ROM drivers (not for SCSI or IDE/ATAPI drives)
#
# CONFIG_CD_NO_IDESCSI is not set

#
# Filesystems
#
# CONFIG_QUOTA is not set
CONFIG_MINIX_FS=y
# CONFIG_EXT_FS is not set
CONFIG_EXT2_FS=y
# CONFIG_XIA_FS is not set
CONFIG_FAT_FS=y
CONFIG_MSDOS_FS=y
CONFIG_VFAT_FS=y
# CONFIG_UMSDOS_FS is not set
CONFIG_PROC_FS=y
CONFIG_NFS_FS=y
# CONFIG_ROOT_NFS is not set
# CONFIG_SMB_FS is not set
CONFIG_ISO9660_FS=y
# CONFIG_HPFS_FS is not set
# CONFIG_SYSV_FS is not set
# CONFIG_AFFS_FS is not set
# CONFIG_UFS_FS is not set

#
# Character devices
#
CONFIG_SERIAL=y
# CONFIG_DIGI is not set
# CONFIG_CYCLADES is not set
# CONFIG_STALDRV is not set
# CONFIG_RISCOM8 is not set
CONFIG_PRINTER=y
# CONFIG_MOUSE is not set
# CONFIG_UMISC is not set
# CONFIG_QIC02_TAPE is not set
# CONFIG_FTAPE is not set
# CONFIG_APM is not set
# CONFIG_WATCHDOG is not set
CONFIG_RTC=y

#
# Sound
#
CONFIG_SOUND=y
# CONFIG_LOWLEVEL_SOUND is not set

#
# Kernel hacking
#
# CONFIG_PROFILE is not set

-- 
Philippe Strauss, CH-1092 Belmont

Email: <philippe.strauss@urbanet.ch> Homepage: http://sicel-home-1-4.urbanet.ch