Re: Adaptec SCSI Driver fails during mirroring failover testing (2.2.15/2.3.99-pre6)

From: Jeff V. Merkey (jmerkey@timpanogas.com)
Date: Thu Apr 27 2000 - 16:04:35 EST


Doug Ledford wrote:
>
> > <sigh>....please check the code in question before proclaiming in ALL CAPS
> that it does something that it doesn't. The Adaptec driver doesn't disable
> anything, it also doesn't time anything out nor call for any SCSI bus resets.
> This is all done at the mid layer of the SCSI code, not at the driver level.
>
>

If a SCSI device fails, the driver should send a TEST UNIT READY (0x00),
followed by an INQUIRY command (0x12), to reprobe just that one device --
not disable the SCSI bus for every active hard disk the driver is
controlling. The Scripts should be robust enough to reset the bus
without a SCSI manager getting involved or the reset being propagated to
some upper layer module. The only exception to this would be for
non-script SCSI chipsets (which this one is not).
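
For reference, the two commands named above are both 6-byte CDBs. What
follows is only a minimal sketch of how they are laid out, not code
taken from the aic7xxx driver or the Linux SCSI mid layer. The function
and macro names are hypothetical, and the 36-byte allocation length is
an assumption (the usual size of standard INQUIRY data).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CDB6_LEN            6     /* both commands use a 6-byte CDB */
#define OP_TEST_UNIT_READY  0x00  /* opcode quoted in the text above */
#define OP_INQUIRY          0x12  /* opcode quoted in the text above */
#define STD_INQUIRY_LEN     36    /* assumed standard INQUIRY data size */

/* Hypothetical helper: TEST UNIT READY is opcode 0x00, all else zero. */
static void build_test_unit_ready(uint8_t cdb[CDB6_LEN])
{
    assert(cdb != NULL);
    memset(cdb, 0, CDB6_LEN);
    cdb[0] = OP_TEST_UNIT_READY;
}

/* Hypothetical helper: INQUIRY carries its allocation length in byte 4. */
static void build_inquiry(uint8_t cdb[CDB6_LEN], uint8_t alloc_len)
{
    assert(cdb != NULL);
    memset(cdb, 0, CDB6_LEN);
    cdb[0] = OP_INQUIRY;
    cdb[4] = alloc_len;
}
```

Both commands would be addressed to the single target/LUN that reported
the failure, which is the point: nothing here requires touching the
other devices on the bus.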

(I wrote SCSI scripts on NCR chipsets in 1991-1993 for Memorex Telex for
Comm and Disk (PC and Mainframe and S370 Channel) and am very familiar
with how this stuff is supposed to work -- killing a bus because someone
pulls a swappable device out is a poor implementation -- it shouldn't
work this way).

I take it this means that if a single SCSI device ever fails, the SCSI
module in Linux will potentially disable active devices, and mirroring
failover on SCSI may not work correctly on some Linux SCSI drivers. FYI
- The IDE driver works just fine if you unplug the cable from an active
hard disk (I get IO errors, but can recover the system). If I take what
you are telling me here at face value, then mirrored failover on SCSI
may not work without some type of change being made to the SCSI layer.
FYI -- NetWare and Windows 2000 both handle this just fine on the
identical hardware.
 
>
>
> Which is to be expected if your system goes down unclean. How long did you
> wait before powering it off?

10 minutes.

>
> > On I/O error, or a SCSI command timeout error, the driver should simply
> > return I/O errors to the system and leave the remaining SCSI devices
> > operational on the system.
>
>

> To be precise, this is exactly what the aic7xxx driver does, no more, no
> less.

Then how do we enable the system not to do this? The SCSI module should
not be disabling active hard disks just because someone pulls a hard
disk out of the chassis on one of the SCSI buses.

BTW. Thanks for responding. :-)

How would you propose I proceed given what you just told me? Is there a
configuration mode I can give Linux to get around this, or is this just
unique to the particular hardware configuration I may be running?
Please advise.

Jeff

>
> --
>
> Doug Ledford <dledford@redhat.com> http://people.redhat.com/dledford
> Please check my web site for aic7xxx updates/answers before
> e-mailing me about problems



