2.4.0-test[1-4]: SCSI-hangs

From: Adrian Reyer (Adrian.Reyer@RUS.Uni-Stuttgart.DE)
Date: Sun Jul 16 2000 - 17:47:26 EST


Hi,

When I put a bit of load on my SCSI-devices, the box hangs and I get
lots of messages in the syslog like this:
Jul 16 21:40:12 prompt kernel: scsi : aborting command due to timeout : pid 0, scsi0, channel 0, id 10, lun 0 Write (10) 00 00 80 01 d7 00 00 08 00
At the end, I get a message saying:
Jul 16 21:40:13 prompt kernel: (scsi0:0:10:-1) timer: Yikes, loop in delayed_scbs list.
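
For anyone reading along, the timeout line above can be decoded into a
target ID, a starting block and a transfer length. The following is only
a minimal sketch: it assumes the kernel prints the opcode name followed by
the remaining nine bytes of the 10-byte CDB in hex, and the helper name
decode_write10 is hypothetical, not something from the kernel sources.

```python
import re

# Assumption: after "Write (10)" the log shows CDB bytes 1..9 in hex
# (flags, 4-byte LBA, group, 2-byte transfer length, control).
LOG_RE = re.compile(
    r"id (?P<id>\d+), lun (?P<lun>\d+) "
    r"Write \(10\) (?P<cdb>(?:[0-9a-f]{2} ?){9})"
)


def decode_write10(line):
    """Return target id, lun, LBA and block count, or None if no match."""
    m = LOG_RE.search(line)
    if m is None:
        return None
    b = [int(x, 16) for x in m.group("cdb").split()]
    return {
        "id": int(m.group("id")),
        "lun": int(m.group("lun")),
        "lba": int.from_bytes(bytes(b[1:5]), "big"),     # CDB bytes 2..5
        "blocks": int.from_bytes(bytes(b[6:8]), "big"),  # CDB bytes 7..8
    }


SAMPLE = (
    "Jul 16 21:40:12 prompt kernel: scsi : aborting command due to timeout : "
    "pid 0, scsi0, channel 0, id 10, lun 0 Write (10) 00 00 80 01 d7 00 00 08 00"
)
print(decode_write10(SAMPLE))
```

Under that assumption the sample is an 8-block write starting at block
8389079 on ID 10, i.e. on the 18GB-disk.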

The box is a Dual-Celeron-500, 256MB RAM, Asus P2B-DS-Board, onboard
AHA-2940, 2*9GB (ID 0, ID 1), 1*18GB (ID 10). The 9GB-disks are
combined into a raid1.
'Hangs' in this case means:
 - no more disk-IO
 - all files already in the cache seem to be fine
 - processes accessing disks are unkillable
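
One way to list the stuck processes is to look for state "D"
(uninterruptible sleep) in /proc/<pid>/stat. This is a rough sketch that
assumes the unkillable processes are in that state, which the report does
not confirm; the function names are made up for illustration.

```python
import os


def parse_stat(text):
    """Split one /proc/<pid>/stat line into (pid, comm, state)."""
    # comm is wrapped in parentheses and may itself contain ')' or spaces,
    # so split on the first '(' and the last ')'.
    left = text.index("(")
    right = text.rindex(")")
    pid = int(text[:left].strip())
    comm = text[left + 1:right]
    state = text[right + 1:].split()[0]
    return pid, comm, state


def blocked_processes(proc_root="/proc"):
    """Return (pid, comm) for every process currently in state 'D'."""
    blocked = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited meanwhile, or the line was unreadable
        if state == "D":
            blocked.append((pid, comm))
    return blocked
```

Calling blocked_processes() during a hang should then show tar, rsync or
whatever else is waiting on the disks, provided /proc is still readable.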
I tried combinations as follows:
2.2.16, plain: all seems fine
2.2.16, raid1+ext2: all seems fine
2.2.16, raid1+reiserfs: all seems fine
2.4.0-test1, plain: Hangs, if I try and do e.g.
                    tar xvzf linux-2.4.0-test1.tar.gz
2.4.0-test1, ext2 on raid1: Hangs, same situation
    After a while, the SCSI-bus gets reset and the
    tar/rsync/whatever I use continues.
2.4.0-test1, reiserfs on raid1: Hangs, same situation, but no
    SCSI-reset, instead the Yikes mentioned above.
Actually, I tried them in the opposite order, first raid1+reiserfs,
then working back.
If I make a list of all directories on one disk (1GB data), sort it in
reverse order and rsync every single directory recursively to the
software-raid1, all works fine, even though the last directory of
course contains all the others. If I just rsync the top-level
directories, the system hangs.
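
The directory-by-directory run can be sketched roughly as below. The
function name rsync_plan and both paths are hypothetical, and the report
does not say which rsync options were used; -a is only an assumption.

```python
import os


def rsync_plan(src_root, dst_root):
    """Return one rsync command per directory, in reverse sorted order."""
    src_root = os.path.abspath(src_root)
    dirs = [dirpath for dirpath, _dirs, _files in os.walk(src_root)]
    commands = []
    # Reverse lexicographic order puts every subdirectory before its
    # parent, so src_root itself (which contains all others) comes last.
    for d in sorted(dirs, reverse=True):
        rel = os.path.relpath(d, src_root)
        target = os.path.normpath(os.path.join(dst_root, rel))
        commands.append(["rsync", "-a", d + "/", target + "/"])
    return commands


# Usage sketch (paths are placeholders):
# import subprocess
# for cmd in rsync_plan("/mnt/source", "/mnt/raid1"):
#     os.makedirs(cmd[3], exist_ok=True)  # deep targets are synced first
#     subprocess.run(cmd, check=True)
```

Each command moves only what the earlier, deeper runs have not already
copied, so the writes arrive in many small batches instead of one large
one.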

First I thought about a hardware-problem with the 18GB-disk, but the
problem occurs with all SCSI-disks. It's all U2W, U2W-disks,
U2W-cables, U2W-Terminators, one internal in the AHA2940, one at the
end of the SCSI-cable. None of the disks is terminated itself.
The 9GB-disks are new, so is the board, the 18GB-disk is 4 months
old.

If anyone needs further information, the machine is available for
testing for at least the next 5 days.

Bye,
    Heron

-- 
                                 Adrian.Reyer@rus.uni-stuttgart.de
finger rusheron@helpdesk.rus.uni-stuttgart.de for PGP-Key
Key fingerprint = 70 7E 09 95 3A 43 63 39  B4 20 62 1C 82 A8 45 3A
