Re: 2.2.16 crashes on ES40 (with spinlock messages...)

From: Martin Frey (frey@scs.ch)
Date: Mon Jul 24 2000 - 11:36:01 EST


Hi,

On our ES40 running 2.2.14, in an NFS-mounted directory, I get in syslog:
/.nfs0000000021364c6b0000000f, error=-13
NFS: can't silly-delete mqueue/.nfs0000000021358c8700000015, error=-13
NFS: can't silly-delete tmp/.nfs0000000021364c6f0000001a, error=-13
NFS: can't silly-delete tmp/.nfs0000000021364c710000001c, error=-13
NFS: can't silly-delete tmp/.nfs0000000021364c730000001e, error=-13
NFS: can't silly-delete mqueue/.nfs0000000021358c8900000028, error=-13
NFS: can't silly-delete tmp/.nfs0000000021364c700000001b, error=-13
NFS: can't silly-delete tmp/.nfs0000000021364c720000001d, error=-13
NFS: can't silly-delete tmp/.nfs0000000021364c740000001f, error=-13
NFS: can't silly-delete mqueue/.nfs0000000021358c8c0000002b, error=-13

The test4_out.1 file says:
 
The e-mail address is frey
The hostname of this machine is es0.scs.ch
The architecture of this machine is ALPHA
The process id is 873
The executable is szin_w-11-2.ALPHA
The reader is read-szin-11-15
 
Time = map_shmem: error in shmget: Invalid argument
test4_a : Error in szin_w-11-2.ALPHA : program crashed, status=1

Exactly the same thing (including the kernel messages) happens when
running on a local SCSI disk with an ext2 filesystem. What's
wrong?
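
For reference when reading the messages above: on Linux, error=-13 in
the syslog lines is -EACCES (permission denied), and the "Invalid
argument" reported for shmget is EINVAL. A minimal Python sketch for
decoding such values is below. The helper name describe_errno is made
up for illustration and is not part of any tool mentioned in this
thread.

```python
import errno
import os


def describe_errno(kernel_error):
    """Map a kernel-style negative error value to (symbolic name, message).

    Hypothetical helper for illustration: the kernel logs errors as
    negative errno values, so -13 corresponds to errno 13.
    """
    code = abs(kernel_error)
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


# The value from the "can't silly-delete" syslog lines.
print(describe_errno(-13))

# The errno behind "Invalid argument" in the shmget failure.
print(describe_errno(-errno.EINVAL))
```

The shmget(2) manual page lists EINVAL for a requested size below
SHMMIN or above SHMMAX, or for an existing segment smaller than the
requested size. The output above does not show which of these, if any,
applies here.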

Andrew Pochinsky wrote:
>
> Hi,
>
> Some time ago I posted a message to this list about mysterious crashes
> on Alpha ES40. Peter Rival, Pat O'Rourke and Michal Jaegermann made
> some interesting suggestions for a possible cause. Unfortunately, I
> was not able to fix the problem. This time, however, our user built
> code that reliably crashes the system after running for a few seconds.
> Each crash is accompanied by a 'spinlock ... stuck' message repeated
> for every processor in the system. Once all the processors are stuck,
> the system goes catatonic. To check that the problem is not related to
> some flaky hardware, I rebooted the same kernel with nosmp flag. The
> problem is gone (of course, the machine now runs four times slower ;(
>
> --andrew
>
> P.S. The tarball of the executable could be found at
> <ftp://ftp.lns.mit.edu/pub/avp/smp-crash.tar.gz>. Simply start runme
> and wait.
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.rutgers.edu
> Please read the FAQ at http://www.tux.org/lkml/

-- 
Supercomputing Systems AG        email: frey@scs.ch
Martin Frey                      www:   http://www.scs.ch/~frey
Technoparkstrasse 1              phone: +41 (0)1 445 16 00
CH-8005 Zurich                   fax:   +41 (0)1 445 16 10



