Linux 2.1.118 SMP problem

Steffen Luitz (Steffen.Luitz@cern.ch)
Wed, 26 Aug 1998 16:12:50 +0200 (MET DST)


After booting 2.1.118-SMP on our cluster of 8 dual-processor PII-266
machines (root file system on NFS, md driver, no screen, no keyboard,
serial console output only), we are getting occasional lockups while
copying ca. 10 MBytes from the NFS root file system to a local disk (not
an md array).

The hardware: Tahoe Tyan II Motherboard, 192 MByte RAM, 2 x Adaptec
2940UW, 1 x EEpro100.

On the serial console we get the message:

wait_on_bh, CPU 0:
irq: 1 [0 1]
bh: 1 [0 1]
<[c0113c4f]> <[c0175342]> <[c0175424]> <[c0148761]>

repeating every few seconds.

System.map says:

del_timer __rpc_wake_up rpc_wake_up_task nfs_updatepage
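
For anyone repeating this lookup: each address in the trace is matched to
the System.map symbol with the highest start address not above it. Below
is a minimal Python sketch of that lookup, assuming the usual
"address type name" line layout of System.map. The helper names
(parse_system_map, resolve, resolve_trace) are made up for illustration
and are not part of any kernel tool.

```python
import bisect
import re

# Matches the <[c0113c4f]> style entries printed in the trace.
TRACE_ADDR = re.compile(r"<\[([0-9a-fA-F]+)\]>")


def parse_system_map(lines):
    """Parse 'address type name' lines into parallel sorted lists."""
    symbols = []
    for line in lines:
        parts = line.split()
        if len(parts) >= 3:
            symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return [a for a, _ in symbols], [n for _, n in symbols]


def resolve(addr, addresses, names):
    """Return (symbol, offset) for the last symbol starting at or below addr."""
    i = bisect.bisect_right(addresses, addr) - 1
    if i < 0:
        return None  # address lies below the first known symbol
    return names[i], addr - addresses[i]


def resolve_trace(trace_line, addresses, names):
    """Resolve every <[hex]> address found in one line of trace output."""
    return [resolve(int(m, 16), addresses, names)
            for m in TRACE_ADDR.findall(trace_line)]
```

To use it, one would read the System.map that belongs to the running
kernel, e.g. addresses, names = parse_system_map(open("System.map")),
and pass the trace line to resolve_trace(). The result is only meaningful
if the map file comes from exactly the same build as the kernel that
printed the trace.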

Cheers

Steffen

+---------------------------------+
! Dr. Steffen Luitz !
! CERN / EP Division !
! CH-1211 Geneve 23 !
! e-mail: Steffen.Luitz@cern.ch !
+---------------------------------+

2.1.114-SMP did much better (ca. 1 crash / day / 8 machines under heavy
disk and network I/O)

Cheers

Steffen

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.altern.org/andrebalsa/doc/lkml-faq.html

Linux version 2.1.118 (root@pcna48ctl.cern.ch) (gcc version 2.7.2.3) #3
SMP Wed Aug 26 11:53:44 MEST 1998

and I got a repeating

wait_on_bh, CPU 0:
irq: 1 [0 1]
bh: 1 [0 1]
<[c0113c4f]> <[c0175342]> <[c0175424]> <[c0148761]>

on one of the PCs.

System.map says:

del_timer __rpc_wake_up rpc_wake_up_task nfs_updatepage

This seems to happen while copying 10-15 MByte of data from the NFS root
to a local disk file system. It doesn't happen every time: at the next
boot the copy worked on that PC.

Cheers

Steffen
