Re: aic7xxx and tapes (Was Re: How to force a kernel panic ?)

Richard B. Johnson (root@analogic.com)
Mon, 13 Jan 1997 08:52:30 -0500 (EST)


On Mon, 13 Jan 1997, James V. Di Toro III wrote:

> On Sat, 11 Jan 1997, Richard B. Johnson wrote:
>
> > I have no problem making my kernel panic. I use the aic7xxx SCSI driver.
> > I just try to use a SCSI tape-drive <death>. Maybe someone will fix this
> > sometime so I won't have to backup across the network to another machine
> > that uses the same tape drive successfully.
>
> What's the deal on this? I recently installed an AHA-2940 Ultra
> and a Python 28388-XXX in a system, and have not had any problems.
>
I have the AHA-2940 "Ultra", plus two Quantum XP32150W disks, plus a
Toshiba XM-3601 CD/ROM. Everything works as long as I don't try to use a
mounted MS-DOS file system or attempt to use my EXABYTE EXB-8200. Everything
also works if I replace the controller with an older Adaptec 1542. However,
this system was built (at considerable cost) to be the "Ultimate" Linux
machine. With the hardware I coerced the company to pay for, I could have
gotten an SGI, Super Alpha, or top-of-the-line Sun.

Filesystem   1024-blocks     Used Available Capacity Mounted on
/dev/sdb1         893986   723201    124600      85% /
/dev/sdb3        1165950    51553   1054154       5% /home/users
/dev/sda1         554048    92688    461360      17% /dos/drive_C
/dev/sda5         417432    68784    348648      16% /dos/drive_D

The performance is excellent, otherwise I would be screaming. I hope that
sometime somebody finds the problem with the aic7xxx driver.

If I execute:
# mt status
The result is ....

Jan 9 17:35:27 chaos kernel: scsi0: Target busy, TCL=0x50.
Jan 9 17:35:27 chaos kernel: scsi0 channel 0 : resetting for second half of retries.
Jan 9 17:35:27 chaos kernel: SCSI bus is being reset for host 0 channel 0.
Jan 9 17:35:27 chaos kernel: aic7xxx: (reset) target/channel 5/0
Jan 9 17:35:27 chaos kernel: aic7xxx: (reset_device) target/channel -1/A, active_scb 2
Jan 9 17:35:27 chaos kernel: aic7xxx: (match_scb) comparing target/channel -1/A to scb 1/A
Jan 9 17:35:27 chaos kernel: aic7xxx: (reset_device) aborting SCB 1, TCL=1/0/0
Jan 9 17:35:27 chaos kernel: aic7xxx: (match_scb) comparing target/channel -1/A to scb 1/A
Jan 9 17:35:27 chaos kernel: aic7xxx: (reset_channel) Resetting current channel A
Jan 9 17:35:27 chaos kernel: aic7xxx: (reset_channel) Channel reset, sequencer restarted
Jan 9 17:35:27 chaos kernel: aic7xxx: (done_aborted_scbs) Aborting scb 1, TCL=1/0/0
Jan 9 17:35:27 chaos kernel: scsi0: Target 1, channel A, now synchronous at 10.0MHz, offset 15.
Jan 9 17:35:27 chaos kernel: scsi0: Target 5, channel A, refusing synchronous negotiation; using asynchronous transfers.
Jan 9 17:35:27 chaos kernel: net_bh: too many loops, dropping...
Jan 9 17:35:28 chaos kernel: scsi0: Target busy, TCL=0x50.
Jan 9 17:35:29 chaos kernel: scsi0: Target busy, TCL=0x50.
Jan 9 17:35:29 chaos kernel: st0: Error 27070008.
Jan 9 17:35:48 chaos kernel: scsi0: Target busy, TCL=0x50.
Jan 9 17:35:49 chaos kernel: scsi0: Target busy, TCL=0x50.
Jan 9 17:35:49 chaos kernel: scsi0 channel 0 : resetting for second half of retries.
Jan 9 17:35:49 chaos kernel: SCSI bus is being reset for host 0 channel 0.
Jan 9 17:35:49 chaos kernel: aic7xxx: (reset) target/channel 5/0
Jan 9 17:35:49 chaos kernel: aic7xxx: (reset_device) target/channel -1/A, active_scb 4
Jan 9 17:35:49 chaos kernel: aic7xxx: (match_scb) comparing target/channel -1/A to scb 1/A
Jan 9 17:35:49 chaos kernel: aic7xxx: (reset_device) aborting SCB 6, TCL=1/0/0
Jan 9 17:35:49 chaos kernel: aic7xxx: (match_scb) comparing target/channel -1/A to scb 1/A
Jan 9 17:35:49 chaos kernel: aic7xxx: (reset_device) aborting SCB 3, TCL=1/0/0
Jan 9 17:35:49 chaos kernel: aic7xxx: (match_scb) comparing target/channel -1/A to scb 1/A
Jan 9 17:35:49 chaos kernel: aic7xxx: (reset_device) aborting SCB 2, TCL=1/0/0
Jan 9 17:35:49 chaos kernel: aic7xxx: (match_scb) comparing target/channel -1/A to scb 1/A
Jan 9 17:35:49 chaos kernel: aic7xxx: (reset_device) aborting SCB 1, TCL=1/0/0
Jan 9 17:35:49 chaos kernel: aic7xxx: (match_scb) comparing target/channel -1/A to scb 1/A
Jan 9 17:35:49 chaos kernel: aic7xxx: (reset_device) aborting SCB 5, TCL=1/0/0
Jan 9 17:35:49 chaos kernel: aic7xxx: (match_scb) comparing target/channel -1/A to scb 1/A
Jan 9 17:35:49 chaos kernel: aic7xxx: (reset_device) aborting SCB 7, TCL=1/0/0
Jan 9 17:35:49 chaos kernel: aic7xxx: (match_scb) comparing target/channel -1/A to scb 1/A
Jan 9 17:35:49 chaos kernel: aic7xxx: (reset_device) aborting SCB 0, TCL=1/0/0
Jan 9 17:35:49 chaos kernel: aic7xxx: (match_scb) comparing target/channel -1/A to scb 1/A
Jan 9 17:35:49 chaos last message repeated 6 times
Jan 9 17:35:49 chaos kernel: aic7xxx: (reset_channel) Resetting current channel A
Jan 9 17:35:49 chaos kernel: aic7xxx: (reset_channel) Channel reset, sequencer restarted
Jan 9 17:35:49 chaos kernel: aic7xxx: (done_aborted_scbs) Aborting scb 0, TCL=1/0/0
Jan 9 17:35:49 chaos kernel: aic7xxx: (done_aborted_scbs) Aborting scb 1, TCL=1/0/0
Jan 9 17:35:49 chaos kernel: aic7xxx: (done_aborted_scbs) Aborting scb 2, TCL=1/0/0
Jan 9 17:35:49 chaos kernel: aic7xxx: (done_aborted_scbs) Aborting scb 3, TCL=1/0/0
Jan 9 17:35:49 chaos kernel: aic7xxx: (done_aborted_scbs) Aborting scb 5, TCL=1/0/0
Jan 9 17:35:49 chaos kernel: aic7xxx: (done_aborted_scbs) Aborting scb 6, TCL=1/0/0
Jan 9 17:35:49 chaos kernel: aic7xxx: (done_aborted_scbs) Aborting scb 7, TCL=1/0/0
Jan 9 17:35:49 chaos kernel: scsi0: Target 1, channel A, now synchronous at 10.0MHz, offset 15.
Jan 9 17:35:49 chaos kernel: scsi0: Target 5, channel A, refusing synchronous negotiation; using asynchronous transfers.
Jan 9 17:35:51 chaos kernel: scsi0: Target busy, TCL=0x50.
Jan 9 17:35:51 chaos kernel: scsi0: Target busy, TCL=0x50.
Jan 9 17:35:51 chaos kernel: st0: Error 27070008.
Jan 9 17:36:05 chaos kernel: scsi0: Target busy, TCL=0x50.
Jan 9 17:36:05 chaos last message repeated 4 times
Jan 9 17:36:05 chaos kernel: st0: Error 27070008.
Jan 9 17:36:09 chaos kernel: scsi0: Target busy, TCL=0x50.
Jan 9 17:36:09 chaos last message repeated 4 times
Jan 9 17:36:09 chaos kernel: st0: Error 27070008.
Jan 9 17:36:39 chaos kernel: scsi0: Target busy, TCL=0x50.
Jan 9 17:36:40 chaos kernel: scsi0: Target busy, TCL=0x50.
Jan 9 17:36:40 chaos kernel: scsi0 channel 0 : resetting for second half of retries.
Jan 9 17:36:40 chaos kernel: SCSI bus is being reset for host 0 channel 0.
Jan 9 17:36:40 chaos kernel: aic7xxx: (reset) target/channel 5/0
Jan 9 17:36:40 chaos kernel: aic7xxx: (reset_device) target/channel -1/A, active_scb 4
Jan 9 17:36:40 chaos kernel: aic7xxx: (match_scb) comparing target/channel -1/A to scb 1/A
Jan 9 17:36:40 chaos kernel: aic7xxx: (reset_device) aborting SCB 2, TCL=1/0/0
Jan 9 17:36:40 chaos kernel: aic7xxx: (match_scb) comparing target/channel -1/A to scb 1/A
Jan 9 17:36:40 chaos kernel: aic7xxx: (reset_channel) Resetting current channel A
Jan 9 17:36:40 chaos kernel: aic7xxx: (reset_channel) Channel reset, sequencer restarted
Jan 9 17:36:40 chaos kernel: aic7xxx: (done_aborted_scbs) Aborting scb 2, TCL=1/0/0
Jan 9 17:36:40 chaos kernel: scsi0: Target 1, channel A, now synchronous at 10.0MHz, offset 15.
Jan 9 17:36:40 chaos kernel: scsi0: Target 5, channel A, refusing synchronous negotiation; using asynchronous transfers.
Jan 9 17:36:50 chaos kernel: scsi0: Target busy, TCL=0x50.
Jan 9 17:36:50 chaos kernel: scsi0: Target busy, TCL=0x50.
Jan 9 17:36:50 chaos kernel: st0: Error 27070008.

Note that this was just "status"; I didn't try to read from or write to
the device.
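
For reading the log above, a small decoder can help. This is only a sketch
under two assumptions: that the TCL value packs target, channel, and LUN as
bits 7-4, bit 3, and bits 2-0, and that the number st0 prints is the usual
32-bit SCSI result word (driver, host, message, and status bytes, from high
to low). The function names are made up for illustration. If those
assumptions hold, TCL=0x50 is target 5, channel A, LUN 0, and the low byte
of 27070008 is 0x08, the SCSI BUSY status, which agrees with the
"Target busy" messages.

```python
def decode_tcl(tcl):
    """Split a TCL byte into (target, channel, lun).

    Assumed layout: bits 7-4 target, bit 3 channel, bits 2-0 LUN.
    """
    target = (tcl >> 4) & 0x0F
    channel = "AB"[(tcl >> 3) & 0x01]
    lun = tcl & 0x07
    return target, channel, lun


def split_result(result):
    """Split a 32-bit SCSI result word into its four bytes.

    Assumed layout, high byte to low byte: driver, host, message, status.
    """
    return {
        "driver": (result >> 24) & 0xFF,
        "host": (result >> 16) & 0xFF,
        "msg": (result >> 8) & 0xFF,
        "status": result & 0xFF,
    }


if __name__ == "__main__":
    # Values taken from the log lines in this post.
    print(decode_tcl(0x50))
    print({name: hex(byte) for name, byte in split_result(0x27070008).items()})
```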

In the following I do:

# od -x /dev/st0

Jan 9 17:43:21 chaos kernel: SCSI bus is being reset for host 0 channel 0.
Jan 9 17:43:21 chaos kernel: aic7xxx: (reset) target/channel 5/0
Jan 9 17:43:21 chaos kernel: aic7xxx: (reset_device) target/channel -1/A, active_scb 7
Jan 9 17:43:21 chaos kernel: aic7xxx: (match_scb) comparing target/channel -1/A to scb 1/A
Jan 9 17:43:21 chaos kernel: aic7xxx: (reset_device) aborting SCB 0, TCL=1/0/0
Jan 9 17:43:21 chaos kernel: aic7xxx: (match_scb) comparing target/channel -1/A to scb 1/A
Jan 9 17:43:21 chaos kernel: aic7xxx: (reset_channel) Resetting current channel A
Jan 9 17:43:21 chaos kernel: aic7xxx: (reset_channel) Channel reset, sequencer restarted
Jan 9 17:43:21 chaos kernel: aic7xxx: (done_aborted_scbs) Aborting scb 0, TCL=1/0/0
Jan 9 17:43:21 chaos kernel: scsi0: Target 1, channel A, now synchronous at 10.0MHz, offset 15.
Jan 9 17:43:21 chaos kernel: scsi0: Target 5, channel A, refusing synchronous negotiation; using asynchronous transfers.
Jan 9 17:43:22 chaos kernel: scsi0: Target busy, TCL=0x50.
Jan 9 17:43:23 chaos kernel: scsi0: Target busy, TCL=0x50.
Jan 9 17:43:23 chaos kernel: st0: Error 27070008.

In the following I do:

# tar -clf /dev/st0 .

Jan 9 17:47:12 chaos kernel: scsi0: Target busy, TCL=0x50.
Jan 9 17:47:13 chaos kernel: scsi0: Target busy, TCL=0x50.
Jan 9 17:47:13 chaos kernel: scsi0 channel 0 : resetting for second half of retries.
Jan 9 17:47:13 chaos kernel: SCSI bus is being reset for host 0 channel 0.
Jan 9 17:47:13 chaos kernel: aic7xxx: (reset) target/channel 5/0
Jan 9 17:47:13 chaos kernel: aic7xxx: (reset_device) target/channel -1/A, active_scb 1
Jan 9 17:47:13 chaos kernel: aic7xxx: (match_scb) comparing target/channel -1/A to scb 1/A
Jan 9 17:47:13 chaos kernel: aic7xxx: (reset_device) aborting SCB 3, TCL=1/0/0
Jan 9 17:47:13 chaos kernel: aic7xxx: (match_scb) comparing target/channel -1/A to scb 1/A
Jan 9 17:47:13 chaos kernel: aic7xxx: (reset_channel) Resetting current channel A
Jan 9 17:47:13 chaos kernel: aic7xxx: (reset_channel) Channel reset, sequencer restarted
Jan 9 17:47:13 chaos kernel: aic7xxx: (done_aborted_scbs) Aborting scb 3, TCL=1/0/0
Jan 9 17:47:13 chaos kernel: scsi0: Target 1, channel A, now synchronous at 10.0MHz, offset 15.
Jan 9 17:47:23 chaos kernel: scsi0: Target busy, TCL=0x50.
Jan 9 17:47:23 chaos kernel: scsi0: Target busy, TCL=0x50.
Jan 9 17:47:23 chaos kernel: st0: Error 27070008.

If I execute:

# ls -R /mnt

with /mnt containing an MS-DOS file system, the result can't be saved because
of the panic that follows. Basically, there are a bunch of "Queue full"
messages, followed by timeout messages, followed by SCSI reset and command
abort messages, followed by 'aiee freeing interrupt', going downhill from
there.

The end result is usually a root file system that is no longer
recognizable as one and that can't be repaired with fsck. I use the ext2
file system.

I have taken to backing up to an external SCSI disk which I borrow from
a VAX. When I crash, I boot from a floppy that mounts this file system,
then I go through all the mkfs and tar stuff to restore (not fun). I've
got this recovery down to a science. I can fix everything in an hour!

Interestingly, if I use 'echo *' instead of 'ls', the panic doesn't
occur. This is probably because 'echo *' doesn't 'stat' the files to
get size/date/time information.
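
The difference between the two access patterns can be sketched as follows.
This is an illustration, not what 'ls' or the shell actually run: the
function names are hypothetical, and unlike 'echo *' the listing here also
includes dot files. The first function only reads directory entries; the
second makes one extra stat call per file, which is the step that pulls in
size and time information.

```python
import os
import tempfile


def names_only(path):
    """Return entry names without reading per-file metadata (like 'echo *')."""
    return sorted(os.listdir(path))


def names_with_stat(path):
    """Return (name, size, mtime) per entry; needs one stat call per file."""
    entries = []
    for name in sorted(os.listdir(path)):
        info = os.stat(os.path.join(path, name))
        entries.append((name, info.st_size, info.st_mtime))
    return entries


if __name__ == "__main__":
    # Demonstrate on a scratch directory rather than a real mount point.
    with tempfile.TemporaryDirectory() as scratch:
        with open(os.path.join(scratch, "a.txt"), "w") as handle:
            handle.write("hello")
        print(names_only(scratch))
        print(names_with_stat(scratch))
```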

The basic reason for the crash is that the SCSI device is being asked
to read beyond the physical end of the media. I verified this by hacking
the code. A sector number of 0xA0001AFF starts this off with my MS-DOS
file system. The media contains only 4406960 512-byte sectors, i.e.,
0x433EB0. I think there is a signed/unsigned problem somewhere in
the code.
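
To make the numbers concrete, here is a minimal sketch of the arithmetic.
It is not the driver or file-system code and does not claim to locate the
bug. It only shows that 0xA0001AFF has its top bit set, so it reads as a
negative number when treated as a signed 32-bit value, and that a check
testing only the upper bound on a signed value would let it through. All
names here are made up for illustration.

```python
TOTAL_SECTORS = 4406960   # sector count from the post (0x433EB0)
BAD_SECTOR = 0xA0001AFF   # sector number from the post


def as_signed32(value):
    """Reinterpret the low 32 bits of value as a signed two's-complement int."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def in_range(sector, total=TOTAL_SECTORS):
    """Full check: sector must be non-negative and below the sector count."""
    return 0 <= sector < total


def upper_bound_only(sector, total=TOTAL_SECTORS):
    """Hypothetical flawed check: compares a signed value to the upper bound only."""
    return as_signed32(sector) < total


if __name__ == "__main__":
    print(hex(TOTAL_SECTORS))
    print(as_signed32(BAD_SECTOR))
    print(in_range(BAD_SECTOR), upper_bound_only(BAD_SECTOR))
```

Treated as unsigned, the sector number is far past the end of the disk and
the full check rejects it. Treated as signed it is negative, so the
upper-bound-only comparison accepts it, which is one way a signed/unsigned
mix-up could let such a request reach the device.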

Cheers,
Dick Johnson
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Richard B. Johnson
Project Engineer
Analogic Corporation
Voice : (508) 977-3000 ext. 3754
Fax : (508) 532-6097
Modem : (508) 977-6870
Ftp : ftp@boneserver.analogic.com
Email : rjohnson@analogic.com, johnson@analogic.com
Penguin : Linux version 2.1.20 on an i586 machine (66.15 BogoMips).
Warning : It's hard to remain at the trailing edge of technology.
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-