> On 14 Sep 1998, Philippe Troin wrote:
> ...
> > I also have a lot of scsi tape weirdnesses on 2.1.121. Specifically,
> > stupid mt tricks don't work anymore (mt bsfm gives I/O error
> > sometimes). No panics though using vanilla 2.1.121 on AIC78xxx with
> > Archive Python DAT drive.
> >
> The patch did not touch the bsfm command. If you change the line
> '#define DEBUG 0' in linux/drivers/scsi/st.c to '#define DEBUG 1', the
> driver writes to the console/log more information about the problems it
> encounters. Enabling the verbose SCSI messages in kernel configuration
> also helps.
I'll give it a try later on... assuming we can fix the other problem :-)
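
For reference, the '#define DEBUG' switch mentioned above is the usual compile-time pattern: the extra messages are only built in when the macro is nonzero. Below is a minimal user-space sketch of that pattern, not the actual st.c code. The function name log_tape_error and the message text are hypothetical, and fprintf() stands in for the kernel's logging.

```c
#include <stdio.h>

/* Set to 0 to compile the extra diagnostics out entirely. */
#define DEBUG 1

/*
 * Hypothetical helper: report a sense key to the given stream.
 * Returns the number of characters written, or 0 when DEBUG is 0
 * and the message has been compiled out.
 */
static int log_tape_error(FILE *out, int sense_key)
{
#if DEBUG
    return fprintf(out, "st: sense key %d\n", sense_key);
#else
    (void)out;        /* unused when diagnostics are disabled */
    (void)sense_key;
    return 0;
#endif
}
```

With DEBUG at 1, a call such as log_tape_error(stderr, 3) writes one line. With DEBUG at 0, the same call compiles to a no-op, which is why the change requires rebuilding the driver.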
> > Plus if I try to dump some filesystems, the dump process hangs on
> > down_failed forever:
> >
> > 100 0 367 366 0 0 1028 608 wait4 S p2 0:02 dump
> > 140 0 368 367 0 0 1052 660 unix_data_w S p2 0:00 dump
> > 44 0 369 368 0 0 0 0 do_exit Z p2 0:00 dump
> > 44 0 370 368 0 0 0 0 down_failed DW p2 0:00 (dump)
> > 40 0 371 368 0 0 1028 616 down_failed D p2 0:00 dump
> >
> This sounds like the problem some people encounter but I have never been
> able to reconstruct (I will try again tonight with dump). The process is
> hanging at down() which probably means that the tape driver is waiting for
> the previously sent SCSI command to finish. There are at least the
> following two possibilities:
> 1. There is a bug in the tape driver so that it will never call up() or
> the SCSI interrupt is lost, or
Likely... (this was very reproducible)
> 2. The SCSI bus is hung.
Since I could still access everything else on the bus, not likely.
> The timeout in the tape driver is very long (900 seconds) and one needs a
> lot of patience in order to find out if the system is waiting for a
> timeout or is really hung. You can make the timeout shorter by either
> editing the driver (change ST_TIMEOUT) or using mt (mt sttimeout xxx).
> A timeout of 60 seconds would probably be enough for a DAT.
I was sure the driver was hung since, when I discovered the problem,
the dump processes had been stuck for at least 2 hours, far longer than
the 900-second timeout.
Note that I was dumping an ext2 fs on a RAID-0 partition split across two
disks on the same SCSI controller as the DAT drive. That makes 3 devices
active at the same time, plus a bunch of drivers involved (md,
aic78xx, sd, st, ext2). I never had any problems with this setup until
2.1.121, though.
If you need help reproducing the bug, email me.