2.2.10 SMP spin_lock bug....

Robert Dinse (nanook@eskimo.com)
Wed, 4 Aug 1999 18:26:00 -0700 (PDT)


Benjamin LaHaise sent me a very small patch, as follows:

-----
Hello Robert,

If the lockup your system is triggering matches that someone else is
seeing (it looks like it =), then the following fix should work: in
kernel/fork.c around line 613, change the p->processor = NO_PROC_ID; to
p->processor = current->processor; Apparently this fix will be included
in 2.2.11.

-ben
-----
I applied this and both machines ran for about 27 hours, one is still
running OK, but the other crashed and burned with:

spin_lock (f0165514) CPU#1 stuck at f00554d8, owner PC(f0043cac):CPU3
spin_lock (f0165514) CPU#2 stuck at f00554d8, owner PC(f0043cac):CPU3
write_lock_irq (f0165514) CPU3 stuck at f0035470, owner PC(00000000):CPU(0)
reader [i] = 00000000 reader [i] = 00000000 reader [i] = 00000000 reader [i]
= 00000000 reader [i] = 00000000 reader [i] = 00000000 reader [i] = 00000000
reader [i] = 00000000 reader [i] = 00000000 reader [i] = 00000000 reader [i]
= 00000000 reader [i] = 00000000 reader [i] = 00000000 reader [i] = 00000000
reader [i] = 00000000 reader [i] = 00000000 reader [i] = 00000000 reader [i]
= 00000000 reader [i] = 00000000 reader [i] = 00000000 reader [i] = 00000000
reader [i] = 00000000 reader [i] = 00000000 reader [i] = 00000000 reader [i]
= 00000000 reader [i] = 00000000 reader [i] = 00000000 reader [i] = 00000000

Then there would be a few blank lines and the whole sequence would repeat.

Here is my .config:

#
# Automatically generated by make menuconfig: don't edit
#

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y

#
# Loadable module support
#
# CONFIG_MODULES is not set

#
# General setup
#
CONFIG_VT=y
CONFIG_VT_CONSOLE=y
# CONFIG_AP1000 is not set
CONFIG_SMP=y
# CONFIG_SUN4 is not set
# CONFIG_PCI is not set

#
# Console drivers
#
CONFIG_PROM_CONSOLE=y
CONFIG_FB=y
CONFIG_DUMMY_CONSOLE=y
CONFIG_FB_SBUS=y
CONFIG_FB_CGSIX=y
# CONFIG_FB_BWTWO is not set
# CONFIG_FB_CGTHREE is not set
# CONFIG_FB_TCX is not set
# CONFIG_FB_CGFOURTEEN is not set
# CONFIG_FB_LEO is not set
# CONFIG_FB_VIRTUAL is not set
# CONFIG_FBCON_ADVANCED is not set
CONFIG_FBCON_FONTWIDTH8_ONLY=y
CONFIG_FONT_SUN8x16=y
# CONFIG_FBCON_FONTS is not set
CONFIG_SBUS=y
CONFIG_SBUSCHAR=y
CONFIG_SUN_MOUSE=y
CONFIG_SERIAL=y
CONFIG_SUN_SERIAL=y
CONFIG_SERIAL_CONSOLE=y
CONFIG_SUN_KEYBOARD=y
CONFIG_SUN_CONSOLE=y
CONFIG_SUN_AUXIO=y
CONFIG_SUN_IO=y
CONFIG_SUN_OPENPROMIO=y
CONFIG_SUN_MOSTEK_RTC=y
# CONFIG_SUN_BPP is not set
# CONFIG_SUN_VIDEOPIX is not set
# CONFIG_SUN_AURORA is not set
# CONFIG_SPARCAUDIO is not set
# CONFIG_SPARCAUDIO_AMD7930 is not set
# CONFIG_SPARCAUDIO_CS4231 is not set
# CONFIG_SPARCAUDIO_DBRI is not set
# CONFIG_SPARCAUDIO_DUMMY is not set
CONFIG_SUN_OPENPROMFS=y
CONFIG_NET=y
CONFIG_SYSVIPC=y
# CONFIG_BSD_PROCESS_ACCT is not set
CONFIG_SYSCTL=y
CONFIG_BINFMT_AOUT=y
CONFIG_BINFMT_ELF=y
# CONFIG_BINFMT_MISC is not set
# CONFIG_BINFMT_JAVA is not set

#
# Floppy, IDE, and other block devices
#
CONFIG_BLK_DEV_FD=y
CONFIG_BLK_DEV_MD=y
# CONFIG_MD_LINEAR is not set
CONFIG_MD_STRIPED=y
# CONFIG_MD_MIRRORING is not set
# CONFIG_MD_RAID5 is not set
# CONFIG_BLK_DEV_RAM is not set
CONFIG_BLK_DEV_LOOP=y
# CONFIG_BLK_DEV_NBD is not set

#
# Networking options
#
CONFIG_PACKET=y
# CONFIG_NETLINK is not set
# CONFIG_FIREWALL is not set
# CONFIG_FILTER is not set
CONFIG_UNIX=y
CONFIG_INET=y
# CONFIG_IP_MULTICAST is not set
# CONFIG_IP_ADVANCED_ROUTER is not set
# CONFIG_IP_PNP is not set
# CONFIG_IP_ROUTER is not set
# CONFIG_NET_IPIP is not set
# CONFIG_NET_IPGRE is not set
CONFIG_IP_ALIAS=y
CONFIG_SYN_COOKIES=y
# CONFIG_INET_RARP is not set
CONFIG_SKB_LARGE=y
# CONFIG_IPV6 is not set
# CONFIG_IPX is not set
# CONFIG_ATALK is not set
# CONFIG_X25 is not set
# CONFIG_LAPB is not set
# CONFIG_BRIDGE is not set
# CONFIG_LLC is not set
# CONFIG_ECONET is not set
# CONFIG_WAN_ROUTER is not set
# CONFIG_NET_FASTROUTE is not set
# CONFIG_NET_HW_FLOWCONTROL is not set
# CONFIG_CPU_IS_SLOW is not set

#
# QoS and/or fair queueing
#
# CONFIG_NET_SCHED is not set

#
# ISDN subsystem
#
# CONFIG_ISDN is not set

#
# SCSI support
#
CONFIG_SCSI=y
CONFIG_BLK_DEV_SD=y
CONFIG_CHR_DEV_ST=y
CONFIG_BLK_DEV_SR=y
CONFIG_BLK_DEV_SR_VENDOR=y
CONFIG_CHR_DEV_SG=y
CONFIG_SCSI_MULTI_LUN=y
CONFIG_SCSI_CONSTANTS=y

#
# SCSI low-level drivers
#
CONFIG_SCSI_SUNESP=y
# CONFIG_SCSI_QLOGICPTI is not set

#
# Fibre Channel support
#
# CONFIG_FC4 is not set
# CONFIG_FC4_SOC is not set
# CONFIG_FC4_SOCAL is not set
# CONFIG_SCSI_PLUTO is not set
# CONFIG_SCSI_FCAL is not set

#
# Network device support
#
CONFIG_NETDEVICES=y
# CONFIG_DUMMY is not set
# CONFIG_PPP is not set
# CONFIG_SLIP is not set
CONFIG_SUNLANCE=y
# CONFIG_HAPPYMEAL is not set
# CONFIG_SUNBMAC is not set
# CONFIG_SUNQE is not set
# CONFIG_MYRI_SBUS is not set

#
# Unix98 PTY support
#
CONFIG_UNIX98_PTYS=y
CONFIG_UNIX98_PTY_COUNT=256

#
# Filesystems
#
CONFIG_QUOTA=y
# CONFIG_AUTOFS_FS is not set
# CONFIG_ADFS_FS is not set
# CONFIG_AFFS_FS is not set
# CONFIG_HFS_FS is not set
CONFIG_FAT_FS=y
# CONFIG_MSDOS_FS is not set
# CONFIG_UMSDOS_FS is not set
# CONFIG_VFAT_FS is not set
CONFIG_ISO9660_FS=y
# CONFIG_JOLIET is not set
# CONFIG_MINIX_FS is not set
# CONFIG_NTFS_FS is not set
# CONFIG_HPFS_FS is not set
CONFIG_PROC_FS=y
CONFIG_DEVPTS_FS=y
# CONFIG_QNX4FS_FS is not set
CONFIG_ROMFS_FS=y
CONFIG_EXT2_FS=y
# CONFIG_SYSV_FS is not set
# CONFIG_UFS_FS is not set

#
# Network File Systems
#
# CONFIG_CODA_FS is not set
CONFIG_NFS_FS=y
CONFIG_NFSD=y
# CONFIG_NFSD_SUN is not set
CONFIG_SUNRPC=y
CONFIG_LOCKD=y
# CONFIG_SMB_FS is not set
# CONFIG_NCP_FS is not set

#
# Partition Types
#
CONFIG_BSD_DISKLABEL=y
# CONFIG_MAC_PARTITION is not set
CONFIG_SMD_DISKLABEL=y
# CONFIG_SOLARIS_X86_PARTITION is not set
# CONFIG_UNIXWARE_DISKLABEL is not set
CONFIG_NLS=y

#
# Native Language Support
#
# CONFIG_NLS_CODEPAGE_437 is not set
# CONFIG_NLS_CODEPAGE_737 is not set
# CONFIG_NLS_CODEPAGE_775 is not set
# CONFIG_NLS_CODEPAGE_850 is not set
# CONFIG_NLS_CODEPAGE_852 is not set
# CONFIG_NLS_CODEPAGE_855 is not set
# CONFIG_NLS_CODEPAGE_857 is not set
# CONFIG_NLS_CODEPAGE_860 is not set
# CONFIG_NLS_CODEPAGE_861 is not set
# CONFIG_NLS_CODEPAGE_862 is not set
# CONFIG_NLS_CODEPAGE_863 is not set
# CONFIG_NLS_CODEPAGE_864 is not set
# CONFIG_NLS_CODEPAGE_865 is not set
# CONFIG_NLS_CODEPAGE_866 is not set
# CONFIG_NLS_CODEPAGE_869 is not set
# CONFIG_NLS_CODEPAGE_874 is not set
# CONFIG_NLS_ISO8859_1 is not set
# CONFIG_NLS_ISO8859_2 is not set
# CONFIG_NLS_ISO8859_3 is not set
# CONFIG_NLS_ISO8859_4 is not set
# CONFIG_NLS_ISO8859_5 is not set
# CONFIG_NLS_ISO8859_6 is not set
# CONFIG_NLS_ISO8859_7 is not set
# CONFIG_NLS_ISO8859_8 is not set
# CONFIG_NLS_ISO8859_9 is not set
# CONFIG_NLS_ISO8859_15 is not set
# CONFIG_NLS_KOI8_R is not set

#
# Watchdog
#
CONFIG_SOFT_WATCHDOG=y

#
# Kernel hacking
#
# CONFIG_MAGIC_SYSRQ is not set

Here are some probably relevant entires from the System.map file:

f016550c D smp_activated
f0165510 D cacheflush_time
f0165514 D kernel_flag
f016551c D bitops_spinlock
f0165520 D smp_process_available

f005537c t max_select_fd
f005546c T do_select
f00556e4 T sys_select
f0055ae8 t do_poll
f0055c00 T sys_poll

f0043a64 t swap_out_process
f0043ae8 t swap_out
f0043c80 t do_try_to_free_pages
f0043d7c T kswapd
f0043ef4 T try_to_free_pages

f0035164 T exit_mm
f00351d4 t exit_notify
f00356ec T do_exit
f0035a7c T sys_exit

Also, another oddity, frequently when it gets in this state, L1-A won't
work and I have to power cycle the machine. But on those occasions where it
does work (like this time), I get the PROM monitor prompt and it responds and
interacts normally, but the messages keep printing as well. A reset or reboot
at that point restors things to normality.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/