Re: aic7xxx sets CDR offline, how to reset?

From: Doug Ledford (dledford@redhat.com)
Date: Wed Sep 04 2002 - 05:48:06 EST


On Wed, Sep 04, 2002 at 12:37:37PM +0200, Andries Brouwer wrote:
>
> The scsi error recovery has many bad properties, but one is its slowness.

It does not have to be this way. It is a solvable problem.

> Once it gets triggered on a machine with SCSI disks it is common to
> have a dead system for several minutes.

Yes, well, this too is solvable. In fact, it reminds me that one of the
things I think needs to be added to the scsi host settings is a default
timeout value for typical devices. Something like adding a default_timeout
value to each Scsi_Device struct and allowing the scsi driver to modify
the value during the slave_attach() call. Then we can set the default
timeout on non-intelligent controllers to something sane while things like
MegaRAID controllers can keep their sky-high timeout values.
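
A minimal sketch of the idea above, in plain C rather than real kernel
code. The struct names, the 30 and 300 second values, and the
attach_device() helper are all hypothetical stand-ins chosen for
illustration; only default_timeout and slave_attach() come from the
proposal itself.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins; NOT the real kernel structures. */
struct scsi_device_sketch {
    unsigned int default_timeout;   /* seconds; the proposed per-device field */
};

struct scsi_host_sketch {
    /* Optional driver hook, modeled on the slave_attach() call named above. */
    void (*slave_attach)(struct scsi_device_sketch *sdev);
};

/* Assumed mid-layer default for non-intelligent controllers (illustrative). */
#define SKETCH_DEFAULT_TIMEOUT 30u

/* A RAID-style driver that raises the timeout for its devices (assumed value). */
static void raid_slave_attach(struct scsi_device_sketch *sdev)
{
    sdev->default_timeout = 300u;
}

/* Mid-layer side: apply the sane default, then let the driver override it. */
static void attach_device(const struct scsi_host_sketch *host,
                          struct scsi_device_sketch *sdev)
{
    sdev->default_timeout = SKETCH_DEFAULT_TIMEOUT;
    if (host->slave_attach != NULL)
        host->slave_attach(sdev);
}
```

A host with no slave_attach hook leaves the device at the sane default,
while a host that installs raid_slave_attach ends up with the larger value.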

> I have not yet met a situation
> in which rebooting was not preferable above scsi error recovery,
> especially since the attempt to recover often fails.

This, too, is solvable. It just requires that the scsi subsystem start
paying attention to *how* things fail and making the error handling code
smart enough to know when to retry things and when to just fail.
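
As a rough sketch of what "paying attention to how things fail" could
look like, the decision can be written as a small classification
function. The failure categories, the retry limit, and the names below
are assumptions made for illustration; they are not the existing scsi
error handler.

```c
/* Hypothetical failure categories; a real handler would derive these
 * from sense data and host status rather than a ready-made enum. */
enum failure_kind {
    FAIL_TIMEOUT,        /* command timed out; may be transient */
    FAIL_BUS_BUSY,       /* bus or target busy; usually transient */
    FAIL_MEDIUM_ERROR,   /* bad media; retrying rarely helps */
    FAIL_DEVICE_GONE     /* device no longer responds to selection */
};

enum recovery_action { ACTION_RETRY, ACTION_FAIL };

#define MAX_RETRIES 3    /* assumed cap so recovery cannot stall for minutes */

/* Decide whether to retry or give up, based on how the command failed. */
static enum recovery_action decide(enum failure_kind kind, int retries_so_far)
{
    if (retries_so_far >= MAX_RETRIES)
        return ACTION_FAIL;

    switch (kind) {
    case FAIL_TIMEOUT:
    case FAIL_BUS_BUSY:
        return ACTION_RETRY;
    case FAIL_MEDIUM_ERROR:
    case FAIL_DEVICE_GONE:
        return ACTION_FAIL;
    }
    return ACTION_FAIL;  /* unknown kinds fail fast */
}
```

The point of the cap and the fail-fast cases is to bound recovery time:
a transient condition gets a few retries, anything else is failed
immediately instead of being retried for minutes.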

-- 
  Doug Ledford <dledford@redhat.com>     919-754-3700 x44233
         Red Hat, Inc. 
         1801 Varsity Dr.
         Raleigh, NC 27606
  


