RE: aic7xxx problems: should I upgrade to 2.0.32?

Doug Ledford (dledford@dialnet.net)
Tue, 18 Nov 1997 18:34:49 -0600 (CST)


On 18-Nov-97 Eloy A. Paris wrote:
>The aic7xxx, however is having problems with the tape drive (SCSI
>disks are just fine). When I do a "tar cf /dev/st0 /home", for
>example, I get the kernel error messages below.
>
>My questions are:
>
>1) Should I upgrade to 2.0.32 to see if the tape problems are solved
>with the latest version of the aic7xxx driver?

Judging by the error messages, this looks like a configuration problem, not
a general driver problem. More on that at the end.

>2) Can I stay with 2.0.30 but with the latest version of the aic7xxx
>driver? Is this something bad? I am asking because 2.0.30 is rock
>solid for me, and I don't like the horror stories I've heard about
>2.0.31.

2.0.32 should be fine as far as those horror stories are concerned. So far,
the only problems actually raised against this kernel either exist in
earlier kernels as well or are very driver specific. The latest aic7xxx
code will not go easily into 2.0.30, though: too much has changed in the
SCSI drivers directory in general (the Makefile, for one thing).

>
>3) Where's the latest version of the aicxxx driver?

The latest *official* version is in the 2.0.32 kernel. The latest
unofficial patches of mine are in the same place as always, ftp.dialnet.net
in /pub/linux/aic7xxx.

[ snipped error messages ]

The important thing to note here is that the system didn't time out a
command to the tape drive; it timed out a command to the disk drive. This
usually indicates that the tape device is configured in the Adaptec EZ-SCSI
BIOS with the disconnect privilege disabled. This means that when the
backup is finished and issues the write filemark command followed by the
rewind command, the tape drive can *NOT* disconnect from the SCSI controller
while these commands are completing. Since the rewind command, for one, can
take a *very* significant amount of time, this causes other commands on the
SCSI bus to time out. When those commands time out, the bus gets reset.
This, of course, breaks the I_T_L nexus between the tape drive and the
controller. Since the tape drive is in the middle of an operation that,
if not completed, leaves the tape in a somewhat indeterminate state, it
doesn't respond to the controller again until it completes the command (or
completes the reset phase, which can be quite long on DAT drives like the
one you have). In the meantime, the kernel has tried to re-establish contact
with the tape drive and failed, which results in an error condition.
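
To make the sequence above concrete, here is a toy model of a shared bus.
It is only a sketch: the function name and every duration in it (a 90
second rewind, a 30 second disk command timeout, a 1 second disk transfer)
are made-up illustrative values, not constants taken from the kernel or the
aic7xxx driver.

```python
# Toy model of one SCSI bus shared by a tape drive and a disk.
# All names and durations are hypothetical, chosen only for illustration.

def disk_command_outcome(rewind_secs, disk_timeout_secs, disk_service_secs,
                         tape_can_disconnect):
    """Return (outcome, elapsed_secs) for a disk command issued just as
    the tape drive starts a rewind."""
    # With disconnect allowed, the tape releases the bus while it rewinds,
    # so the disk command can use the bus right away.
    # With disconnect disabled, the tape holds the bus for the whole rewind.
    bus_busy_secs = 0 if tape_can_disconnect else rewind_secs
    finish_secs = bus_busy_secs + disk_service_secs
    if finish_secs > disk_timeout_secs:
        # The command's timer expires before it ever completes; in the real
        # system this is the point where the bus reset follows.
        return ("timeout", disk_timeout_secs)
    return ("ok", finish_secs)


if __name__ == "__main__":
    for allowed in (True, False):
        outcome, elapsed = disk_command_outcome(
            rewind_secs=90, disk_timeout_secs=30, disk_service_secs=1,
            tape_can_disconnect=allowed)
        state = "enabled" if allowed else "disabled"
        print(f"disconnect {state}: disk command {outcome} after {elapsed}s")
```

With disconnect enabled the disk command completes normally; with it
disabled the disk command is the one that times out, even though the tape
drive is the device actually holding the bus.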

----------------------------------
E-Mail: Doug Ledford <dledford@dialnet.net>
Date: 18-Nov-97
Time: 18:34:51
----------------------------------