[PROBABLY SOLVED] aic7xxx problems: should I upgrade to 2.0.32

Eloy A. Paris (Eloy.Paris@ven.ra.rockwell.com)
Wed, 19 Nov 1997 03:13:14 -0400


Hello all,

First of all, thanks to everyone who helped with this problem. I really
appreciate everyone's help.

To refresh your memory, this is the problem I am having:

>one of our production servers was working fine with a NCR53c810 SCSI
>controller but because of a power supply failure (after 75 days of up
>time) we ended up replacing the old server with a new machine with an
>AHA-2940 Ultra SCSI adapter. The old NCR53c810 didn't have any
>problems handling the two SCSI disks and the H-P SureStore 2000 tape
>drive.
>
>The aic7xxx, however, is having problems with the tape drive (SCSI
>disks are just fine). When I do a "tar cf /dev/st0 /home", for
>example, I get the kernel error messages below.

And these are the options I was considering:

>1) Should I upgrade to 2.0.32 to see if the tape problems are solved
>with the latest version of the aic7xxx driver?
>
>2) Can I stay with 2.0.30 but with the latest version of the aic7xxx
>driver? Is this something bad? I am asking because 2.0.30 is rock
>solid for me, and I don't like the horror stories I've heard about
>2.0.31.
>
>3) Where's the latest version of the aic7xxx driver?

OK, some of you won't believe this, but the timeout problems I was having
with my AHA-2940U and my SureStore DAT tape drive seem to have gone away.
Today I was able to do a 1.7 GByte backup to tape with no problems, and not
a single message was written to the kernel log. As far as I know, I had no
SCSI timeouts.
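
For anyone wanting to repeat the "nothing in the kernel log" check, a
minimal sketch of counting suspicious SCSI lines might look like the
following. The function name, the default log path, and the search patterns
are all assumptions for illustration; they are not taken from the actual
aic7xxx messages, and the log location depends on the syslog configuration.

```shell
# Hypothetical helper: count log lines that mention SCSI (or the aic7xxx
# driver) together with a timeout, abort, or reset.
# The patterns below are guesses, not the driver's documented messages.
count_scsi_errors() {
    log=${1:-/var/log/messages}   # assumed default; adjust for your syslog setup
    # First keep only SCSI-related lines, then count the ones that look like errors.
    grep -i -E 'scsi|aic7xxx' "$log" | grep -c -i -E 'timeout|timed out|abort|reset'
}
```

Running "count_scsi_errors /var/log/messages" before and after a backup and
comparing the two numbers gives a rough indication of whether new errors
were logged during the run.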

Jean-Francois Micouleau suggested that I check that the SCSI bus has active
termination and that term-power is provided by the AHA only. I was going to
do that before upgrading from 2.0.30 to 2.0.32, but I changed my mind and
upgraded first, without double-checking everything, and it worked!!!

While the tape backup was running, I tried several things to crash the
kernel, but none of them did. For example, I ran "find /
-type f -exec cp {} /dev/null \;" to generate a lot of SCSI activity.
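
The load test above can be wrapped in a small function so it is easy to
point at one directory instead of the whole tree. This is only a sketch:
the function name is made up, and the -xdev option (stay on one filesystem,
so find does not wander into /proc) and the file count are additions that
were not part of the original command.

```shell
# Hypothetical helper: read every regular file under a directory and throw
# the data away, to keep the disks busy while a tape backup is running.
generate_read_load() {
    dir=${1:-/}
    # cp to /dev/null forces a full read of each file; -print only fires
    # when cp succeeded, so the final number is the count of files read.
    find "$dir" -xdev -type f -exec cp {} /dev/null \; -print 2>/dev/null \
        | wc -l | tr -d ' '
}
```

For example, "generate_read_load /home" reads everything under /home and
prints how many files were read successfully.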

While backing up 1.7 GBytes to tape with BRU, a process that involves a
write followed by a verification pass, the system was 100% usable. With
2.0.30, the high CPU load (probably caused by the timeouts and retries) made
the system almost unusable.

I checked the termination of my SCSI bus, and everything is fine, unless the
Barracuda hard disk lacks active termination (which, according to the
manual, I don't think is the case).

Doug also suggested that SCSI disconnection for my tape drive might be
disabled in the AHA BIOS. I double-checked that as well, and it is not
disabled: all SCSI devices are allowed to disconnect. Sync negotiation is
enabled for all devices as well.

I did not enable any of the compile-time options like "tagged commands" for
the aic7xxx driver.

Does this make sense to you (that my problems went away with 2.0.32)?
Please let me know if you want me to test further.

Thank you all.

E.-